Mail Stop Petition 
Art Unit 2154 



Re: U.S. Utility Patent Application 

Application No. 09/223,046; Filed: December 30, 1998 
For: Method for Providing Extended Precision in SIMD Vector 
Arithmetic Operations 

Inventors: Van Hook et al. 

Our Ref: 0056.10US (1778.0110001) 



Sir: 



In response to the Decision Refusing Status Under 37 C.F.R. 1.47(a) mailed March 18, 
2005, the following documents are transmitted herewith for appropriate action by the U.S. Patent 
and Trademark Office: 



1. Petition Fee Transmittal (PTO/SB/17p); 

2. Request for Reconsideration of Petition Under 37 C.F.R. § 1.47(a); 

3. Statement of Facts in Support of Filing On Behalf of Non-Signing 
Inventor Under 37 C.F.R. § 1.47(a), including the following: 

a. Exhibit A, a copy of a letter sent to Timothy J. Van Hook on April 
25, 2005; 

b. Exhibit B, a copy of the FedEx Shipping Label for the package 
sent to Timothy J. Van Hook on April 25, 2005; 

c. Exhibit C, a copy of a self-addressed stamped envelope sent to 
Timothy J. Van Hook on April 25, 2005; 

d. Exhibit D, a copy of the original patent application specification as 
filed in the present application on December 30, 1998, and in 
parent U.S. Patent Application No. 08/947,648, on October 9, 
1997; 

e. Exhibit E, a copy of an Amendment and Reply Under 37 C.F.R. § 
1.111 as filed on August 11, 2000, and sent to Timothy J. Van 
Hook on April 25, 2005; 

f. Exhibit F, a copy of an Amendment Under 37 C.F.R. § 1.312 and 
accompanying Request to Approve Proposed Drawing Corrections 
as filed on December 5, 2000, and sent to Timothy J. Van Hook on 
April 25, 2005; 

g. Exhibit G, a copy of an Amendment and Reply Under 37 C.F.R. § 
1.111 as filed on June 14, 2002, and sent to Timothy J. Van Hook 
on April 25, 2005; 

h. Exhibit H, a copy of an Amendment and Reply Under 37 C.F.R. § 
1.114 as filed on October 13, 2004, and sent to Timothy J. Van 
Hook on April 25, 2005; 

i. Exhibit I, a copy of the currently pending claims for the present 
application as sent to Timothy J. Van Hook on April 25, 2005; 

j. Exhibit J, a Supplemental Declaration for U.S. Patent Application 
No. 09/223,046 as sent to Timothy J. Van Hook on April 25, 2005; 

k. Exhibit K, a copy of an email from FedEx dated April 28, 2005, 
confirming delivery of the FedEx Shipment to Timothy J. Van 
Hook on April 28, 2005; 

4. Copy of the original executed Declaration (in four parts) as filed in both 
the present application and parent U.S. Patent Application No. 
08/947,648; 

5. Credit Card Payment Form (PTO-2038) for $200.00 to cover the petition 
fee set forth in 37 C.F.R. § 1.17(g); and 



6. One (1) return postcard. 

It is respectfully requested that the attached postcard be stamped with the date of filing of 
these documents, and that it be returned to our courier. In the event that an extension of time is 
necessary to prevent abandonment of this patent application, then such extension of time is 
hereby petitioned. 

The U.S. Patent and Trademark Office is hereby authorized to charge any fee deficiency, 
or credit any overpayment, to our Deposit Account No. 19-0036. 



Respectfully submitted, 



Sterne, Kessler, Goldstein & Fox p.l.l.c. 



Donald J. Featherstone 
Attorney for Applicants 
Registration No. 33,876 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



In re application of: 

Van Hook et al. 

Appl. No.: 09/223,046 

Filed: December 30, 1998 

For: Method for Providing Extended 
Precision in SIMD Vector 
Arithmetic Operations 



Confirmation No.: 2296 
Art Unit: 2154 
Examiner: Donaghue, Larry D. 
Atty. Docket: 0056.10US (1778.0110001) 



Request for Reconsideration of Petition Under 37 C.F.R. § 1.47(a) 



Commissioner for Patents Mail Stop Petition 

P.O. Box 1450 

Alexandria, VA 22313-1450 

Sir: 

In response to the Decision Refusing Status Under 37 CFR 1.47(a) mailed March 18, 
2005, Applicants hereby respectfully request that the petition under 37 C.F.R. § 1.47(a) filed 
in the above-captioned patent application be reconsidered along with the documents 
submitted herewith and that status under 37 C.F.R. § 1.47(a) be accorded to the present 
application. Accordingly, in satisfaction of the requirements set forth in 37 C.F.R. § 1.47(a) 
and M.P.E.P. §§ 409.03(a), (d), and (e), the following documents are filed herewith: 

1. A copy of the original Declaration for Patent Application executed by 
inventors Henry P. Moreton, Peter Hsu, William A. Huffman, and Earl 
A. Killian as filed in the above-captioned patent application, in 
accordance with 37 C.F.R. § 1.47(a); 



2. A Statement of Facts in Support of Filing On Behalf of Non-Signing 
Inventor Under 37 C.F.R. § 1.47(a) from LuAnne M. DeSantis, Esq., 
along with referenced Exhibits A-K; and 

3. A Credit Card Payment Form (PTO-2038) for $200.00 in payment of 
the petition fee set forth in 37 C.F.R. § 1.17(g) in accordance with 37 
C.F.R. § 1.47. 

The Decision Refusing Status Under 37 CFR 1.47(a) mailed March 18, 2005, states: 

A grantable petition under 37 C.F.R. § 1.47(a) requires: (1) 
proof that the non-signing inventor cannot be reached or 
refuses to sign the oath or declaration after having been 
presented with the application papers (specification, claims and 
drawings); (2) an acceptable oath or declaration in compliance 
with 35 U.S.C. §§ 115 and 116; (3) the petition fee; and (4) a 
statement of the last known address of the non-signing 
inventor.... 

Petitioner respectfully submits that the documents filed herewith satisfy all the 
requirements listed above and set forth in the Decision Refusing Status Under 37 CFR 
1.47(a). 

(1) Proof that the non-signing inventor cannot be reached or refuses to sign the oath or 
declaration after having been presented with the application papers 

A Statement of Facts in Support of Filing on Behalf of Non-Signing Inventor Under 
37 C.F.R. § 1.47(a), as required by M.P.E.P. §§ 409.03(a)(B) and (d), from LuAnne M. 
DeSantis, Esq., is submitted herewith along with Exhibits A-K. These documents provide 
proof of the pertinent facts that the non-signing inventor refuses to sign the Declaration after 
having been presented with the application papers. 

(2) An acceptable oath or declaration in compliance with 35 U.S.C. §§ 115 and 116 

The Decision Refusing Status Under 37 CFR 1.47(a) mailed March 18, 2005, states 
that "a new declaration is not required." However, for the convenience of the Petitions 
Attorney, a copy of the original Declaration for Patent Application executed by inventors 
Henry P. Moreton, Peter Hsu, William A. Huffman, and Earl A. Killian, with the signature 
block of the non-signing inventor Timothy J. Van Hook left blank, is provided herewith as 
filed in the above-captioned patent application in accordance with 37 C.F.R. § 1.47(a). 
Petitioner respectfully submits that the Declaration for Patent Application provided herewith 
should be considered as having been signed by all of the available joint inventors on behalf of 
the non-signing inventor in accordance with M.P.E.P. § 409.03(a)(A). 

(3) The petition fee 

A Credit Card Payment Form (PTO-2038) for $200.00 in payment of the petition fee 
set forth in 37 C.F.R. § 1.17(g) is submitted herewith in accordance with 37 C.F.R. § 1.47. 

(4) A statement of the last known address of the non-signing inventor 

Petitioner hereby states that the last known address of the non-signing inventor 

Timothy J. Van Hook as required by M.P.E.P. §§ 409.03(a)(C) and (e) appears below. 

224 Oakgrove Avenue 
Atherton, CA 94027 

Summary 

Petitioner respectfully submits that the documents filed herewith satisfy all the 
requirements in the Decision Refusing Status Under 37 CFR 1.47(a) and the requirements set 
forth in 37 C.F.R. § 1.47(a) and M.P.E.P. §§ 409.03(a), (d) and (e). Therefore, Petitioner 
requests that the present application be accorded status under 37 C.F.R. § 1.47(a). 

It is believed that no extension of time is necessary. However, if an extension of time 
is required to prevent abandonment of the present application, then such extension is hereby 
petitioned and any fee required therefore is hereby authorized to be charged to our Deposit 
Account 19-0036. 

As noted above, a Credit Card Payment Form (PTO-2038) for $200.00 in payment of 
the fee for filing a petition under 37 C.F.R. § 1.47 as set forth in 37 C.F.R. § 1.17(g) is 
submitted herewith. The U.S. Patent and Trademark Office is hereby authorized to charge 
any fee deficiency, or credit any overpayment, to our Deposit Account No. 19-0036. 

Respectfully submitted, 

Sterne, Kessler, Goldstein & Fox p.l.l.c. 



Date: 




Donald J. Featherstone 
Attorney for Applicants 
Registration No. 33,876 



1100 New York Avenue, N.W. 
Washington, D.C. 20005-3934 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



In re application of: 

Van Hook et al. 

Appl. No.: 09/223,046 

Filed: December 30, 1998 

For: Method for Providing Extended 
Precision in SIMD Vector 
Arithmetic Operations 



Confirmation No.: 2296 
Art Unit: 2154 
Examiner: Donaghue, Larry D. 
Atty. Docket: 0056.10US (1778.0110001) 



Statement Of Facts In Support of Filing On Behalf Of Non-Signing 
Inventor Under 37 C.F.R. § 1.47(a) 



Commissioner for Patents 

P.O. Box 1450 

Alexandria, VA 22313-1450 



Mail Stop Petition 



Sir: 



I, LuAnne M. DeSantis, Esq., hereby declare: 

1. I am making this statement of facts in support of filing on behalf of a 
non-signing inventor under 37 C.F.R. § 1.47(a) with regard to U.S. 
Non-Provisional Patent Application No. 09/223,046, filed December 
30, 1998 ("the '046 patent application"). 

2. I am employed at the law firm of Sterne, Kessler, Goldstein & Fox 
P.L.L.C. ("SKGF"), 1100 New York Avenue, N.W., Washington, D.C. 
20005-3934. 



3. Mr. Timothy J. Van Hook ("Mr. Van Hook") is an inventor named in 
the '046 patent application. His last known address as of April 28, 
2005, is as follows: 



224 Oakgrove Avenue 
Atherton, CA 94027 



4. The invention disclosed and/or claimed in the above-identified patent 
application was made while Mr. Van Hook was employed by Silicon 
Graphics, Inc. ("SGI"), 2011 N. Shoreline Boulevard, Mountain View, 
California, 94043-1389. The '046 patent application is now assigned to 
MIPS Technologies, Inc. ("MIPS"). The non-signing inventor is not 
currently employed at either SGI or MIPS. 

5. The '046 patent application is a continuation of U.S. Non-Provisional 
Patent Application No. 08/947,648 ("the '648 patent application"), filed 
October 9, 1997 (now U.S. Patent No. 5,864,703, issued January 26, 
1999). 

6. According to SKGF records, a Declaration was signed for the '648 
patent application by all inventors except Mr. Van Hook. The 
Declaration was filed in both the present '046 patent application and 
the '648 parent application. A petition for status under rule 1.47 was 
filed in the '648 parent application. Although the face of the patent that 
issued from the '648 patent application (U.S. Patent No. 5,864,703, 
issued January 26, 1999) indicates that the application was accorded 
rule 1.47 status, a copy of a Decision granting status under rule 1.47 
could not be located. 

7. This Declaration issue was discovered when reviewing the '046 patent 
application file in preparation for filing a continuation application. 

8. On the evening of June 2, 2004, I spoke with Mr. Van Hook via 
telephone regarding a similar issue in another commonly owned 
pending application in which Mr. Van Hook is named as an inventor. 
During this conversation, I asked if I could verify his mailing address. 
Mr. Van Hook verified that his mailing address was as indicated above 
in paragraph number 3. In addition, Mr. Van Hook stated that due to 
the history between his former employer and himself, he may or may 
not open any package sent to him on behalf of our client, and that he 
may or may not sign or return anything. He explained, in some detail, 
that a lawsuit related to intellectual property and/or trade secrets had 
been brought against him in the past by his former employer. The 
lawsuit has since been dropped. However, Mr. Van Hook stated that 
he would not sign anything without a release from his former employer 
stating that suit will never be brought against him again. 

9. On April 25, 2005, SKGF sent a package via Federal Express to Mr. 
Van Hook that included a letter signed by Mr. Donald J. Featherstone, 
Esq. of SKGF, a copy of the '046 patent application, copies of 
subsequent amendments made to the '046 patent application, a list of 
the currently pending claims in the '046 patent application, a 
Supplemental Declaration, and a stamped self-addressed return 
envelope (see EXHIBITS A-J). 

10. On April 28, 2005, SKGF received email confirmation from Federal 
Express that the package sent on April 25, 2005, to Mr. Van Hook was 
received and signed for by "A.JALPA" (see EXHIBIT K). 

11. On May 11, 2005, I telephoned Mr. Van Hook to ask if he had any 
questions regarding the package sent on April 25, 2005. He confirmed 
that he had received the package, but again stated that he refuses to 
sign or return anything due to the conflict with his former employer. I 
have not received any further communications from Mr. Van Hook. 



I declare that all statements made herein of my own knowledge are true and that all 
statements made on information from review of the file history of the patent application are 
believed to be true, and further that these statements were made with the knowledge that 
willful false statements or the like so made are punishable by fine or imprisonment or both 
under Section 1001 of Title 18 of the United States Code, and that such willful false 
statements may jeopardize the validity of the patent application or any patent issued thereon. 



Respectfully submitted, 



Sterne, Kessler, Goldstein & Fox p.l.l.c. 



LuAnne M. DeSantis 




1100 New York Avenue, N.W. 
Washington, D.C. 20005-3934 
(202) 371-2600 

Writer's Direct Number: 

(202) 772-8629 

Internet Address: 

DONF@SKGF.COM 

Via Federal Express 



Re: U.S. Patent No. 5,864,703; Issued: January 26, 1999 

U.S. Patent Application No. 09/223,046; Filed December 30, 1998 

For: Method for Providing Extended Precision in SIMD Vector Arithmetic 

Operations 
Inventors: Van Hook et al. 
Our Refs: 1778.0110000 and 1778.0110001 



Dear Mr. Van Hook: 

Our law firm is handling a family of patents on which you were named as an inventor. 
This patent family specifically includes the following: 

• U.S. Patent No. 5,864,703, issued January 26, 1999; and 

• U.S. Patent Application No. 09/223,046, filed December 30, 1998. 

These documents are entitled "Method for Providing Extended Precision in SIMD Vector 
Arithmetic Operations" and both are based on the same patent specification. The initially filed 
patent application was owned by Silicon Graphics, Inc. However, the original patent and the 
pending application are now owned by MIPS Technologies, Inc., our client. 

When the initial patent application was filed, the law firm handling the filing requested 
status under 37 C.F.R. § 1.47. However, the request did not adequately meet the requirements 
set forth in Rule 1.47, which allows an application to be patented even though one or more of the 
inventors refuses to sign the Declaration or cannot be reached. Because the inadequate request 
for Rule 1.47 Status in the initial application was not discovered until recently, it has been 
carried through to the subsequently filed application in this patent family. 

We recently brought this error to the attention of the U.S. Patent and Trademark Office 
(USPTO). In order to correct it, the USPTO has directed us to contact you and request that you 
execute two new Declarations, one for each of the patent documents listed above. 

Accordingly, enclosed you will find the following documents: 

1. The original patent application specification, as filed on October 9, 1997 
(U.S. Patent Application No. 08/947,648), and December 30, 1998 (U.S. 
Patent Application No. 09/223,046), on which both patent documents are 
based; 

2. U.S. Patent No. 5,864,703, issued January 26, 1999 (from U.S. Patent 
Application No. 08/947,648, filed October 9, 1997); 

3. Copy of an Amendment and Reply Under 37 C.F.R. § 1.111 as filed in the 
U.S. Patent and Trademark Office on August 11, 2000, for U.S. Patent 
Application No. 09/223,046; 

4. Copy of an Amendment Under 37 C.F.R. § 1.312 and accompanying 
Request to Approve Proposed Drawing Corrections as filed in the U.S. 
Patent and Trademark Office on December 5, 2000, for U.S. Patent 
Application No. 09/223,046; 

5. Copy of an Amendment and Reply Under 37 C.F.R. § 1.111 as filed in the 
U.S. Patent and Trademark Office on June 14, 2002, for U.S. Patent 
Application No. 09/223,046; 

6. Copy of an Amendment and Reply Under 37 C.F.R. § 1.114 as filed in the 
U.S. Patent and Trademark Office on October 13, 2004, for U.S. Patent 
Application No. 09/223,046; 

7. Copy of the currently pending claims for U.S. Patent Application No. 
09/223,046; 

8. A Supplemental Declaration for U.S. Patent No. 5,864,703; and 

9. A Supplemental Declaration for U.S. Patent Application No. 09/223,046. 

We ask that you please review these documents, with particular attention directed toward 
the claims of each patent document. 

A Declaration for a patent application is a document that: 1) confirms each inventor's 
residence, mailing address, and citizenship; 2) certifies that each inventor contributed to at least 
one claim of the presented subject matter; 3) certifies that the specification and claims have been 
reviewed and are understood; and 4) certifies that each inventor acknowledges the duty to 
disclose information that is material to patentability. 

Please carefully review the enclosed Supplemental Declarations and any information 
appearing thereon. Your "residence" address should be your city and state of residence, or, if 
you reside outside the United States, the city and country of residence. Your "mailing" address 
should be the (full) address at which you customarily receive mail. Either your home or business 
address is an acceptable mailing address. Please make any corrections, if necessary, in blue ink, 
and then initial and date in the margin. Once the information on the Declarations is complete 
and correct, and after your review of the additional documents listed on the previous page, please 
sign and date each Declaration in blue ink where indicated. We ask that you attend to this 
matter as soon as possible. 

For your convenience, we have provided a self-addressed, stamped return envelope for 
returning the signed Supplemental Declarations to us. Or, if you prefer not to sign the enclosed 
Declarations, feel free to use the enclosed envelope to return them to us unsigned. 

Please note that every person who signs a document submitted to the USPTO makes a 
certification under 37 C.F.R. § 10.18(b) and (c). Therefore, a copy of this rule is also enclosed 
for your review. 

Because U.S. Patent Application No. 09/223,046 has not yet issued as a patent, it is our 
obligation to remind you that a duty of disclosure continues throughout the entire patent 
application process and ends only with the actual issuance of a patent. Therefore, if you have or 
become aware of any information that might be considered material to patentability, please 
forward it to us immediately. 

We, along with our client, greatly appreciate your assistance with this matter, and look 
forward to your return of the executed Declarations. In the meantime, if you have any comments 
or questions regarding this matter, please do not hesitate to contact us. 



Very truly yours, 




Sterne, Kessler, Goldstein & Fox p.l.l.c. 



DJF/LMY:krd 
Enclosures 

Shipping Label: Your shipment is complete 

1. Use the 'Print' feature from your browser to send this page to your laser or inkjet printer. 

2. Fold the printed page along the horizontal line. 

3. Place label in shipping pouch and affix it to your shipment so that the barcode portion of the label can be read and scanned. 
Warning: Use only the printed original label for shipping. Using a photocopy of this label for shipping purposes is fraudulent 
and could result in additional billing charges, along with the cancellation of your FedEx account number. 

Use of this system constitutes your agreement to the service conditions in the current FedEx Service Guide, available on fedex.com. FedEx will not 
be responsible for any claim in excess of $100 per package, whether the result of loss, damage, delay, non-delivery, misdelivery, or misinformation, 
unless you declare a higher value, pay an additional charge, document your actual loss and file a timely claim. Limitations found in the current FedEx 
Service Guide apply. Your right to recover from FedEx for any loss, including intrinsic value of the package, loss of sales, income interest, profit, 
attorney's fees, costs, and other forms of damage whether direct, incidental, consequential, or special is limited to the greater of $100 or the 
authorized declared value. Recovery cannot exceed actual documented loss. Maximum for items of extraordinary value is $500, e.g. jewelry, 
precious metals, negotiable instruments and other items listed in our Service Guide. Written claims must be filed within strict time limits, see current 
FedEx Service Guide. 

A METHOD FOR PROVIDING EXTENDED PRECISION IN SIMD VECTOR 
ARITHMETIC OPERATIONS 



FIELD OF THE INVENTION 

The present claimed invention relates to the field of single instruction 
multiple data (SIMD) vector process. More particularly, the present claimed 
invention relates to extended precision in SIMD vector arithmetic operations. 

BACKGROUND ART 

Today, most processors in computer systems provide a 64-bit datapath 
architecture. The 64-bit datapath allows operations such as read, write, add, 
subtract, and multiply on the entire 64 bits of data at a time. This added 
bandwidth has significantly improved performance of the processors. 

However, the data types of many real world applications do not utilize 
the full 64 bits in data processing. For example, in digital signal processing 
(DSP) applications involving audio, video, and graphics data processing, the 
light and sound values are usually represented by data types of 8, 12, 16, or 24 
bit numbers. This is because people typically are not able to distinguish the 
levels of light and sound beyond the levels represented by these numbers of 
bits. Hence, DSP applications typically require data types far less than the full 
64 bits provided in the datapath in most computer systems. 

In initial applications, the entire datapath was used to compute an 
image or sound values. For example, an 8 or 16 bit number representing a 
pixel or sound value was loaded into a 64-bit number. Then, an arithmetic 
operation, such as an add or multiply, was performed on the entire 64-bit 
number. This method proved inefficient, however, as it was soon realized 
that not all the data bits were being utilized in the process since digital 
representation of a sound or pixel requires far fewer bits. Thus, in order to 
utilize the entire datapath, a multitude of smaller numbers were packed into 
the 64 bit doubleword. 

Furthermore, much of the data processing in DSP applications involves 
repetitive and parallel processing of small integer data types using loops. To 
take advantage of this repetitive and parallel data processing, a number of today's 
processors implement single instruction multiple data (SIMD) in the 
instruction architecture. For instance, the Intel Pentium MMX™ chips 
incorporate a set of SIMD instructions to boost multimedia performance. 

Prior Art Figure 1 illustrates an exemplary single instruction multiple 
data instruction process. Exemplary registers, vs and vt, in a processor are of 
64-bit width. Each register is packed with four 16-bit data elements fetched 
from memory: register vs contains vs[0], vs[1], vs[2], and vs[3] and register vt 
contains vt[0], vt[1], vt[2], and vt[3]. The registers in essence contain a vector 
of N elements. To add elements of matching index, an add instruction adds, 
independently, each of the element pairs of matching index from vs and vt. 
A third register, vd, of 64-bit width may be used to store the result. For 
example, vs[0] is added to vt[0] and its result is stored into vd[0]. Similarly, 
vd[1], vd[2], and vd[3] store the sums of the vs and vt elements of corresponding 
indexes. Hence, a single add operation on the 64-bit vector results in 4 
simultaneous additions on each of the 16-bit elements. On the other hand, if 
8-bit elements were packed into the registers, one add operation performs 8 
independent additions in parallel. Consequently, when a SIMD arithmetic 
instruction, such as addition, subtraction, or multiply, is performed on the 
data in the 64-bit datapath, the operation actually performs multiple numbers 
of operations independently and in parallel on each of the smaller elements 
comprising the 64 bit datapath. 
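
The element-wise addition described above can be sketched as follows. This is a minimal illustration only and not part of the specification as filed: the function names `pack`, `unpack`, and `simd_add` are hypothetical, the elements are assumed to be unsigned, and element 0 is assumed to occupy the lowest-order bits of the 64-bit word.

```python
def pack(elements, width):
    """Pack unsigned elements into one 64-bit word (assumption: index 0 in the low bits)."""
    assert len(elements) * width == 64
    mask = (1 << width) - 1
    word = 0
    for i, e in enumerate(elements):
        # Masking makes each element wrap modulo 2**width.
        word |= (e & mask) << (i * width)
    return word


def unpack(word, width):
    """Split a 64-bit word back into its 64 // width elements."""
    mask = (1 << width) - 1
    return [(word >> (i * width)) & mask for i in range(64 // width)]


def simd_add(vs, vt, width):
    """Add elements of matching index; no carry crosses an element boundary."""
    sums = [a + b for a, b in zip(unpack(vs, width), unpack(vt, width))]
    return pack(sums, width)


vs = pack([1, 2, 3, 4], 16)
vt = pack([10, 20, 30, 40], 16)
vd = simd_add(vs, vt, 16)          # one instruction, four 16-bit additions
print(unpack(vd, 16))
```

With 8-bit elements the same `simd_add` call performs eight independent additions on the same 64-bit words.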

Unfortunately, however, an arithmetic operation such as add and 
multiply on SIMD vectors typically increases the number of significant bits in 
the result. For instance, an addition of two n-bit numbers may result in a 
number of n+1 bits. Moreover, a multiplication of two n-bit numbers 
produces a number of 2n bit width. Hence, the results of an arithmetic 
operation on a SIMD vector may not be accurate to a desired significant bit. 
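
The growth in significant bits can be checked with a short calculation. The sketch below is illustrative only and assumes unsigned n-bit operands; the helper names `max_sum_bits` and `max_product_bits` are hypothetical.

```python
def max_sum_bits(n):
    """Bits needed to hold the largest sum of two unsigned n-bit numbers."""
    largest = (1 << n) - 1
    return (largest + largest).bit_length()


def max_product_bits(n):
    """Bits needed to hold the largest product of two unsigned n-bit numbers."""
    largest = (1 << n) - 1
    return (largest * largest).bit_length()


for n in (8, 16):
    print(n, max_sum_bits(n), max_product_bits(n))
```

For n = 8 the largest sum needs 9 bits and the largest product needs 16 bits, so neither result is guaranteed to fit back into an 8-bit element.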

Furthermore, the nature of multimedia DSP applications often 
increases inaccuracies in significant bits. For example, many DSP algorithms 
implemented in DSP applications require a series of computations producing 
partial results that are larger, in terms of significant number of bits, 
than the final result. Since the final result does not fully account for the 
significant bits of these partial results, the final result may not accurately 
reflect the ideal result, which takes into account all significant bits of the 
intermediate results. 



To recapture the full significant bits in a SIMD vector arithmetic 
operation, the size of the data in bits for each individual element was typically 
boosted or promoted to twice the size of the original data in bits. Thus, for 
multiplication on 8-bit elements in a SIMD vector for instance, the 8-bit 
elements were converted (i.e., unpacked) into 16-bit elements containing 8 
significant bits to provide enough space to hold the subsequent product. 

Unfortunately, however, the boost in the number of data bits largely 
undermined the benefits of the SIMD vector scheme by reducing the speed of an 
arithmetic operation in half. This is because the boosting of data bits to twice 
the original size results in half as many data elements in a register. Hence, an 
operation on the entire 64-bit datapath comprised of 16-bit elements 
accomplishes only 4 operations in comparison to 8 operations on a 64-bit 
datapath comprised of 8-bit elements. In short, boosting a data size by X-fold 
reduces performance to (1/X)*100 percent of its original level. As a result, instead of 
an effective 64-bit datapath, the effective datapath was only 32 bits wide. 
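
The cost of promotion can be illustrated by counting the elements that fit in the datapath. This sketch is illustrative only; the function names `lanes` and `effective_datapath_bits` are hypothetical and the 64-bit width is the one used in the discussion above.

```python
def lanes(datapath_bits, element_bits, promotion_factor=1):
    """Number of elements processed per operation after widening each element."""
    return datapath_bits // (element_bits * promotion_factor)


def effective_datapath_bits(datapath_bits, element_bits, promotion_factor):
    """Bits of original (unpromoted) data actually processed per operation."""
    return lanes(datapath_bits, element_bits, promotion_factor) * element_bits


print(lanes(64, 8))                       # 8-bit elements, no promotion
print(lanes(64, 8, promotion_factor=2))   # 8-bit elements promoted to 16 bits
print(effective_datapath_bits(64, 8, 2))  # useful bits per operation
```

Promoting 8-bit elements to 16 bits leaves four operations per instruction instead of eight, which corresponds to the 32-bit effective datapath noted above.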

Thus, what is needed is a method and system for providing extended 
precision in SIMD vector arithmetic operations without sacrificing speed and 
performance. 

SUMMARY OF THE INVENTION 

The present invention provides extended precision in SIMD arithmetic 
operations in a processor having a register file and an accumulator. The 
register file is comprised of a plurality of general purpose registers of N bit 
width. The size of the accumulator is preferably an integer multiple of the 
size of the general purpose registers. The preferred embodiment uses 
registers of 64 bits and an accumulator of 192 bits. The present invention first 
loads, from a memory, a first set of data elements into a first vector register 
and a second set of data elements into a second vector register. Each data 
element comprises N bits. Next, an arithmetic instruction is fetched from 
memory and is decoded. Then, a first vector register and a second vector 
register are read from the register file as specified in the arithmetic 
instruction. The present invention then executes the arithmetic instruction 
on corresponding data elements in the first and second vector registers. The 
result of the execution is then written into the accumulator. Then, each 
element in the accumulator is transformed into an N-bit width element and 
written into a third register for further operation or storage in the memory. 
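
The sequence summarized above can be sketched as follows. This is a simplified model under stated assumptions, not the claimed implementation: it assumes four unsigned 16-bit elements per 64-bit register, a 192-bit accumulator divided evenly into four 48-bit elements, and clamping (saturation) as the transformation back to element width. The summary does not specify the transformation, and the names `multiply_into_accumulator` and `read_accumulator` are hypothetical.

```python
ELEMENT_BITS = 16                      # assumed element width
LANES = 64 // ELEMENT_BITS             # four elements per 64-bit register
ACC_LANE_BITS = 192 // LANES           # 48 accumulator bits per element


def multiply_into_accumulator(vs, vt):
    """Multiply corresponding elements and keep each full product in the accumulator."""
    acc_mask = (1 << ACC_LANE_BITS) - 1
    return [(a * b) & acc_mask for a, b in zip(vs, vt)]


def read_accumulator(acc):
    """Transform each accumulator element to element width (assumption: by clamping)."""
    limit = (1 << ELEMENT_BITS) - 1
    return [min(value, limit) for value in acc]


vs = [300, 2, 65535, 0]
vt = [300, 3, 65535, 7]
acc = multiply_into_accumulator(vs, vt)   # full-precision products
vd = read_accumulator(acc)                # values written to the third register
print(acc, vd)
```

Because each accumulator element is wider than a 32-bit product, no significant bits are lost before the final transformation, and the operand registers still hold four elements each.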

In an alternative embodiment, the accumulator contains a third set of 
data elements. After the arithmetic operation between the data elements in 
the first and second vector registers, the result of the execution is added to the 
corresponding elements in the accumulator. The result of the addition is 
then written into the accumulator. Thereafter, each element in the 
accumulator is transformed into an N-bit width element and written into a 
third register for further operation or storage in the memory. 
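
The alternative embodiment, in which results are added to elements already held in the accumulator, can be sketched in the same simplified model. The assumptions are the same as before and are not taken from the specification: unsigned 16-bit elements, 48-bit accumulator elements, and clamping as the final transformation. The names `multiply_accumulate` and `clamp_to_element` are hypothetical.

```python
ELEMENT_BITS = 16       # assumed element width
ACC_LANE_BITS = 48      # assumed accumulator bits per element (192 / 4)


def multiply_accumulate(acc, vs, vt):
    """Add each product vs[i] * vt[i] to the matching accumulator element."""
    acc_mask = (1 << ACC_LANE_BITS) - 1
    return [(c + a * b) & acc_mask for c, a, b in zip(acc, vs, vt)]


def clamp_to_element(acc):
    """Transform accumulator elements to element width (assumption: by clamping)."""
    limit = (1 << ELEMENT_BITS) - 1
    return [min(value, limit) for value in acc]


acc = [0, 0, 0, 0]
acc = multiply_accumulate(acc, [100, 1, 2, 3], [100, 1, 2, 3])
acc = multiply_accumulate(acc, [200, 1, 2, 3], [200, 1, 2, 3])
print(acc, clamp_to_element(acc))
```

The partial sums stay at full precision across the series of operations, and narrowing happens only once, when the accumulator is read out.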
BRIEF DESCRIPTION OF THE DRAWINGS 

The accompanying drawings, which are incorporated in and form a 
part of this specification, illustrate embodiments of the invention and, 
together with the description, serve to explain the principles of the invention: 

Prior Art Figure 1 illustrates an exemplary single instruction multiple 
data (SIMD) instruction method. 

Figure 2 illustrates an exemplary computer system of the present 
invention. 

Figure 3 illustrates a block diagram of an exemplary datapath including 
a SIMD vector unit (VU), a register file, and a vector load/store unit according 
to one embodiment of the present invention. 

Figure 4 illustrates a more detailed datapath architecture including the 
accumulator in accordance with the present invention. 

Figure 5 illustrates a flow diagram of general operation of an 
exemplary arithmetic instruction according to a preferred embodiment of the 
present invention. 

Figure 6 illustrates element select format for 4 16-bit elements in a 64-bit 
register. 

Figure 7 illustrates element select format for 8 8-bit elements in a 64-bit 
register. 

Figure 8 illustrates an exemplary ADDA.fmt arithmetic operation 
between elements of exemplary operand registers vs and vt. 

Figure 9 illustrates an exemplary ADDL.fmt arithmetic operation 
between elements of exemplary operand registers vs and vt. 

DESCRIPTION OF THE PREFERRED EMBODIMENTS 

In the following detailed description of the present invention, 
numerous specific details are set forth in order to provide a thorough 
understanding of the present invention. However, it will be obvious to one 
skilled in the art that the present invention may be practiced without these 
specific details. In other instances well known methods, procedures, 
components, and circuits have not been described in detail so as not to 
unnecessarily obscure aspects of the present invention. 

The present invention features a method for providing extended 
precision in single-instruction multiple-data (SIMD) arithmetic operations in 
a computer system. The preferred embodiment of the present invention 
performs integer SIMD vector arithmetic operations in a processor having a 
64-bit wide datapath within an exemplary computer system described below. 
Extended precision in the SIMD arithmetic operations is supplied through 
the use of an accumulator register having a preferred width of 3 times the 
general purpose register width. Although a datapath of 64 bits is exemplified 
herein, the present invention is readily adaptable to datapaths of other 
widths. 

COMPUTER SYSTEM ENVIRONMENT 
Figure 2 illustrates an exemplary computer system 212 comprised of a 
system bus 200 for communicating information, one or more central 
processors 201 coupled with the bus 200 for processing information and 
instructions, a computer readable volatile memory unit 202 (e.g., random 
access memory, static RAM, dynamic RAM, etc.) coupled with the bus 200 for 
storing information and instructions for the central processor(s) 201, and a 
computer readable non-volatile memory unit (e.g., read only memory, 
programmable ROM, flash memory, EPROM, EEPROM, etc.) coupled with the 
bus 200 for storing static information and instructions for the processor(s). 

Computer system 212 of Figure 2 also includes a mass storage computer 
readable data storage device 204 (hard drive, floppy, CD-ROM, optical drive, 
etc.) such as a magnetic or optical disk and disk drive coupled with the bus 200 
for storing information and instructions. Optionally, system 212 can include 
a display device 205 coupled to the bus 200 for displaying information to the 
user, an alphanumeric input device 206 including alphanumeric and 
function keys coupled to the bus 200 for communicating information and 
command selections to the central processor(s) 201, a cursor control device 207 
coupled to the bus for communicating user input information and command 
selections to the central processor(s) 201, and a signal generating device 208 
coupled to the bus 200 for communicating command selections to the 
processor(s) 201. 

According to a preferred embodiment of the present invention, the 
processor(s) 201 is a SIMD vector unit (VU) which can function as a coprocessor for 
a host processor (not shown). The VU performs arithmetic and logical 
operations on individual data elements within a data word using the 
instruction methods described below. Data words are treated as vectors of 
Nx1 elements, where N can be 8, 16, 32, 64, or multiples thereof. For example, 
a set of Nx1 data elements of either 8- or 16-bit fields comprises a data 
doubleword of 64-bit width. Hence, a 64-bit wide doubleword contains either 
4 16-bit elements or 8 8-bit elements. 

Figure 3 illustrates a block diagram of an exemplary datapath 300 
including a SIMD vector unit (VU) 302, a register file 304, a vector load/store 
unit 318, and crossbar circuits 314 and 316 according to one embodiment of the 
present invention. The VU 302 executes an operation specified in the 
instruction on each element within a vector in parallel. The VU 302 can 
operate on data that is the full width of the local on-chip memories, up to 64 
bits. This allows parallel operations on 8 8-bit, 4 16-bit, 2 32-bit, or 1 64-bit 
elements in one cycle. The VU 302 includes an accumulator 312 to hold 
values to be accumulated or accumulated results. 

The vector register file is comprised of 32 64-bit general purpose 
registers 306 through 310. The general purpose registers 306 through 310 are 
visible to the programmer and can be used to store intermediate results. The 
preferred embodiment of the present invention uses the floating point 
registers (FGR) of a floating point unit (FPU) as its vector registers. 

In this shared arrangement, data is moved between the vector register 
file 304 and memory with Floating Point load and store doubleword 
instructions through the vector load/store unit 318. These load and store 
operations are unformatted. That is, no format conversions are performed 
and therefore no floating-point exceptions can occur due to these operations. 
Similarly, data is moved between the vector register file 304 and the VU 302 
without format conversions, and thus no floating-point exception occurs. 

Within each register, data may be written, or read, as bytes (8-bits), 
short-words (16-bits), words (32-bits), or double-words (64-bits). Specifically, 
the vector registers of the present invention are interpreted in the following 
new data formats: Quad Half (QH), Oct Byte (OB), Bi Word (BW), and Long 
(L). In QH format, a vector register is interpreted as having 16-bit elements. 
For example, a 64-bit vector register is interpreted as a vector of 4 signed 16-bit 
integers. OB format interprets a vector register as being comprised of 8-bit 
elements. Hence, an exemplary 64-bit vector register is seen as a vector of 8 
unsigned 8-bit integers. In BW format, a vector register is interpreted as 
having 32-bit elements. L format interprets a vector register as having 64-bit 
elements. These data types are provided to be adaptable to various register 
sizes of a processor. As described above, data format conversion is not 
necessary between these formats and floating-point format. 
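
As a rough illustration of the QH and OB interpretations described above, the following Python sketch splits one 64-bit value into elements. The function names unpack_qh and unpack_ob are hypothetical, and numbering elements from the least significant end is an assumption made for illustration; the specification does not fix an element ordering at this point.

```python
def unpack_qh(reg64):
    """Interpret a 64-bit value as 4 signed 16-bit elements (QH format)."""
    elems = []
    for i in range(4):
        raw = (reg64 >> (16 * i)) & 0xFFFF
        # Two's complement: a field with bit 15 set is negative.
        elems.append(raw - 0x10000 if raw & 0x8000 else raw)
    return elems


def unpack_ob(reg64):
    """Interpret a 64-bit value as 8 unsigned 8-bit elements (OB format)."""
    return [(reg64 >> (8 * i)) & 0xFF for i in range(8)]


reg = 0xFFFF_0001_8000_7FFF
print(unpack_qh(reg))  # the same 64 bits seen as 4 signed halfwords
print(unpack_ob(reg))  # the same 64 bits seen as 8 unsigned bytes
```

The same bit pattern yields different element values under each format, which is why no conversion is needed when switching between them.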

With reference to Figure 3, the present invention utilizes crossbar 
circuits to select and route elements of a vector operand. For example, the 
crossbar circuit 314 allows selection of elements of a given data type and passes 
on the selected elements as operands to VU 302. The VU 302 performs 
arithmetic operations on operands comprised of elements and outputs the 
result to another crossbar circuit 316. This crossbar circuit 316 routes the 
resulting elements to corresponding element fields in registers such as vd 310 
and accumulator 312. Those skilled in the art will no doubt recognize that 
crossbar circuits are routinely used to select and route the elements of a vector 
operand. 



With reference to Figure 3, the present invention also provides a 
special register, accumulator 312, of preferably 192-bit width. This register is 
used to combine intermediate add, subtract, or multiply results generated by one 
instruction with the intermediate add, subtract, or multiply results generated 
by either previous or subsequent instructions. The accumulator 312 can also 
be loaded with a vector of elements from memory through a register. In 
addition, the accumulator 312 is capable of forwarding data to the VU 302, 
which executes arithmetic instructions. Although the accumulator 312 is 
shown to be included in the VU 302, those skilled in the art will recognize 
that it can also be placed in other parts of the datapath so as to hold either 
accumulated results or values to be accumulated. 



Figure 4 illustrates a more detailed datapath architecture including the 
accumulator 312. In this datapath, the contents of two registers, vs and vt, are 
operated on by an ALU 402 to produce a result. The result from the ALU can 
be supplied as an operand to another ALU such as an adder/subtractor 404. In 
this datapath configuration, the accumulator 312 can forward its content to be 
used as the other operand to the adder/subtractor 404. In this manner, the 
accumulator 312 can be used as both a source and a destination in consecutive 
cycles without causing pipe stalls or data hazards. By thus accumulating the 
intermediate results in their expanded form in tandem with its ability to be used 
as both a source and a destination, the accumulator 312 is used to provide 
extended precision for SIMD arithmetic operations. 

An exemplary accumulator of the present invention is larger in size 
than general purpose registers. The preferred embodiment uses a 192-bit 
accumulator and 64-bit registers. The format of the accumulator is 
determined by the format of the elements accumulated. That is, the data 
type of an accumulator matches the data type of operands specified in an 
instruction. For example, if the operand register is in QH format, the 
accumulator is interpreted to contain 4 48-bit elements. In OB format, the 
accumulator is seen as having 8 24-bit elements. In addition, accumulator 
elements are always signed. Elements are stored from or loaded into the 
accumulator indirectly to and from the main memory by staging the elements 
through the shared Floating Point register file. 

Figure 5 illustrates a flow diagram of an exemplary arithmetic 
operation according to a preferred embodiment of the invention. In step 502, 
an arithmetic instruction is fetched from memory into an instruction register. 
Then in step 504, the instruction is decoded to determine the specific 
arithmetic operation, operand registers, selection of elements in operand 
registers, and data types. The instruction opcode specifies an arithmetic 
operation such as add, multiply, or subtract in its opcode field. The 
instruction also specifies the data type of elements, which determines the 
width in bits and number of elements involved in the arithmetic operation. 
For example, OB data type format instructs the processor to interpret a vector 
register as containing 8 8-bit elements. On the other hand, QH format directs 
the processor to interpret the vector register as having 4 16-bit elements. 

The instruction further specifies two operand registers, a first register 
(vs) and a second register (vt). The instruction selects the elements of the 
second register, vt, to be used with each element of the accumulator, and/or 
the first register, vs. For example, the present invention allows selection of 
one element from the second register to be used in an arithmetic operation 
with all the elements in the first register independently and in parallel. The 
selected element is replicated for every element in the first register. In the 
alternative, the present invention provides selection of all elements from the 
second register to be used in the arithmetic operation with all the elements in 
the first register. The arithmetic operation operates on the corresponding 
elements of the registers independently and in parallel. The present 
invention also provides an immediate value (i.e., a constant) in a vector field 
in the instruction. The immediate value is replicated for every element of 
the second register before an arithmetic operation is performed between the 
first and second registers. 

According to the decoded instruction, the first register and the second 
register with the selected elements are read for execution of the arithmetic 
operation in step 506. Then in step 508, the arithmetic operation encoded in 
the instruction is executed using each pair of the corresponding elements of 
first register and the second register as operands. The resulting elements of 
the execution are written into corresponding elements in the accumulator in 
step 510. According to another embodiment of the present invention, the 
resulting elements of the execution are added to the existing values in the 
accumulator elements. That is, the accumulator "accumulates" (i.e., adds) the 
resulting elements onto its existing elements. The elements in the 
accumulator are then transformed into N-bit width in step 512. Finally, in 
step 514, the transformed elements are stored into memory. The process then 
terminates in step 516. 

The SIMD vector instructions according to the present invention either 
write all 192 bits of the accumulator or all 64 bits of an FPR, or the condition 
codes. Results are not stored to multiple destinations, including the 
condition codes. 

Integer vector operations that write to the FPRs clamp the values being 
written to the target's representable range. That is, the elements are saturated 
for overflows and underflows. For overflows, the values are clamped to the 
largest representable value. For underflows, the values are clamped to the 
smallest representable value. 
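
The clamping behavior described above can be sketched in a few lines of Python. The function name clamp and its parameters are hypothetical; the sketch only assumes ordinary two's complement ranges for signed elements and zero-based ranges for unsigned elements.

```python
def clamp(value, bits, signed):
    """Saturate an integer to the range of a `bits`-wide element."""
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    # Overflow saturates to the largest value, underflow to the smallest.
    return max(lo, min(hi, value))


print(clamp(40000, 16, signed=True))   # 32767
print(clamp(-40000, 16, signed=True))  # -32768
print(clamp(300, 8, signed=False))     # 255
print(clamp(-5, 8, signed=False))      # 0
```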

On the other hand, integer vector operations that write to an 
accumulator do not clamp their values before writing, but allow underflows 
and overflows to wrap around the accumulator's representable range. Hence, 
the significant bits that otherwise would be lost are stored into the extra bits 
provided in the accumulator. These extra bits in the accumulator thus ensure 
that unwanted overflows and underflows do not occur when writing to the 
accumulator or FPRs. 
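
To contrast wrapping with clamping, the following Python sketch wraps a value into a signed range of a given width. The helper name wrap_signed is hypothetical, and two's complement wrap-around is assumed as the meaning of "wrap around the accumulator's representable range".

```python
def wrap_signed(value, bits):
    """Wrap an integer into the signed two's complement range of `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


# A sum that overflows a 16-bit element still fits in a 48-bit accumulator element.
total = 0xFFFF + 0xFFFF
print(wrap_signed(total, 48) == total)  # True: no bits are lost

# The same sum wrapped to 16 bits loses its upper bit.
print(wrap_signed(total, 16))

# Only values beyond 48 bits wrap around; nothing is saturated.
print(wrap_signed(1 << 47, 48))
```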

SELECTION OF VECTOR ELEMENTS 
The preferred embodiment of the present invention utilizes an 
accumulator register and a set of vector registers in performing precision 
arithmetic operations. First, an exemplary vector register, vs, is used to hold a 
set of vector elements. A second exemplary vector register, vt, holds a 
selected set of vector elements for performing operations in conjunction with 
the elements in vector register, vs. The present invention allows an 
arithmetic instruction to select elements in vector register vt for operation 
with corresponding elements in other vector registers through the use of a 
well known crossbar method. A third exemplary vector register, vd, may be 
used to hold the results of operations on the elements of the registers 
described above. Although these registers (vs, vt, and vd) are used to 
associate vector registers with a set of vector elements, other vector registers 
are equally suitable for the present invention. 

To perform arithmetic operations on desired elements of a vector, the 
present invention uses a well known crossbar method adapted to select an 
element of the vector register, vt, and replicate the element in all other 
element fields of the vector. That is, an element of vt is propagated to all 
other elements in the vector to be used with each of the elements of the other 
vector operand. Alternatively, all the elements of the vector, vt, may be 
selected without modification. Another selection method allows an 
instruction to specify as an element an immediate value in the instruction 
opcode vector field corresponding to vt and replicate the element for all other 
elements of vector vt. These elements thus selected are then passed onto the 
VU for arithmetic operation. 

Figure 6 illustrates element select format for 4 16-bit elements in a 64- 
bit register. The exemplary vector register vt 600 is initially loaded with four 
elements: A, B, C, and D. The present invention allows an instruction to 
select or specify any one of the element formats as indicated by rows 602 
through 610. For example, element B for vt 600 may be selected and replicated 
for all 4 elements as shown in row 604. On the other hand, the vt 600 may be 
passed without any modification as in row 610. 

Figure 7 illustrates element select format for 8 8-bit elements in a 64-bit 
register. The exemplary vector register vt 700 is initially loaded with eight 
elements: A, B, C, D, E, F, G, and H. The present invention allows an 
instruction to select or specify any one of the element formats as indicated by 
rows 702 through 718. For example, element G for vt 700 may be selected and 
replicated for all 8 elements as shown in row 714. On the other hand, the vt 
700 may be passed without any modification as in row 718. 
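
The three selection methods described in this section (replicate one element, pass all elements unmodified, replicate an immediate) can be sketched as follows. The function select_elements and its keyword parameters are hypothetical names for illustration; how the selection is encoded in the instruction is not modeled here.

```python
def select_elements(vt, select=None, immediate=None):
    """Return the vt operand as it would be passed to the vector unit.

    select=None, immediate=None: pass vt through unmodified.
    select=i: replicate element i of vt into every element position.
    immediate=k: replicate the constant k into every element position.
    """
    if immediate is not None:
        return [immediate] * len(vt)
    if select is not None:
        return [vt[select]] * len(vt)
    return list(vt)


vt = ["A", "B", "C", "D"]
print(select_elements(vt, select=1))     # ['B', 'B', 'B', 'B']
print(select_elements(vt))               # ['A', 'B', 'C', 'D']
print(select_elements(vt, immediate=3))  # [3, 3, 3, 3]
```

The first two calls correspond to the replicated and unmodified rows of the element select figures, using list index 1 for element B as an assumed numbering.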

ARITHMETIC INSTRUCTIONS 
In accordance with the preferred embodiment of the present invention, 
arithmetic operations are performed on the corresponding elements of vector 
registers. The instruction is fetched from main memory and is loaded into an 
instruction register. It specifies the arithmetic operation to be performed. 

In the following arithmetic instructions, the operands are values in 
integer vector format. The accumulator is in the corresponding accumulator 
vector format. The arithmetic operations are performed between elements of 
vectors occupying corresponding positions in the vector field in accordance 
with SIMD characteristics of the present invention. For example, an add 
operation between vs and vt actually describes eight parallel add operations 
between vs[0] and vt[0] to vs[7] and vt[7]. After an arithmetic operation has 
been performed but before the values are written into the accumulator, a 
wrapped arithmetic is performed such that overflows and underflows wrap 
around the accumulator's representable range. 

Accumulate Vector Add (ADDA.fmt). In the ADDA.fmt instruction of the 
present invention, the elements in vector registers vt and vs are added to 
those in the accumulator. Specifically, the corresponding elements in vector 
registers vt and vs are added. Then, the elements of the sum are added to the 
corresponding elements in the accumulator. Any overflows or underflows in 
the elements wrap around the accumulator's representable range and then 
are written into the accumulator. 

Figure 8 illustrates an exemplary ADDA.fmt arithmetic operation 
between elements of operand registers vs 800 and vt 802. Each of the registers 
800, 802, and 804 contains 4 16-bit elements. Each letter in the elements (i.e., 
A, B, C, D, E, F, G, H, and I) stands for a binary number. FFFF is a 
hexadecimal representation of the 16-bit binary number 1111 1111 1111 1111. The 
vs register 800 holds elements FFFF, A, B, and C. The selected elements of the vt 
register are FFFF, D, E, and F. The ADDA.fmt arithmetic instruction directs 
the VU to add corresponding elements: FFFF+FFFF (=1FFFE), A+D, B+E, and 
C+F. Each of these sums is then added to the corresponding existing 
elements (i.e., FFFF, G, H, and I) in the accumulator 804: FFFF+1FFFE, 
A+D+G, B+E+H, and C+F+I. The addition of the hexadecimal numbers 
1FFFE and FFFF produces 2FFFD, an overflow condition for a 16-bit element 
of a general purpose 64-bit register. The accumulator's representable range is 
48 bits in accordance with the present invention. Since this is more than 
enough bits to represent the number, the entire number 2FFFD is written into the 
accumulator. As a result, no bits have been lost in the addition and 
accumulation process. 
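
The accumulate-add behavior can be sketched in Python as below. The names adda and wrap are hypothetical, the lettered elements of the figure are replaced by arbitrary small integers, and FFFF is treated as an unsigned magnitude as in the figure discussion; none of this is taken from an actual implementation.

```python
ACC_BITS = 48  # accumulator element width assumed for 16-bit operands


def wrap(value, bits=ACC_BITS):
    """Wrap to the signed two's complement range of `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def adda(acc, vs, vt):
    """acc[i] <- wrap(acc[i] + vs[i] + vt[i]) for every element i."""
    return [wrap(a + s + t) for a, s, t in zip(acc, vs, vt)]


vs = [0xFFFF, 1, 2, 3]
vt = [0xFFFF, 4, 5, 6]
acc = [0xFFFF, 7, 8, 9]
result = adda(acc, vs, vt)
print([hex(x) for x in result])
```

The first result element needs more than 16 bits, yet it is held exactly because each accumulator element is 48 bits wide.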

Load Vector Add (ADDL.fmt). According to the ADDL.fmt instruction, 
the corresponding elements in vectors vt and vs are added and then stored 
into corresponding elements in the accumulator. Any overflows or 
underflows in the elements wrap around the accumulator's representable 
range and then are written into the accumulator. 

Figure 9 illustrates an exemplary ADDL.fmt arithmetic operation 
between elements of operand registers vs 900 and vt 902. Each of the registers 
900, 902, and 904 contains 4 16-bit elements. Each letter in the elements (i.e., 
A, B, C, D, E, and F) stands for a binary number. FFFF is a hexadecimal 
representation of the 16-bit binary number 1111 1111 1111 1111. The vs register 
900 holds elements FFFF, A, B, and C. The selected elements of the vt register 
are FFFF, D, E, and F. The ADDL.fmt arithmetic instruction instructs the VU 
to add corresponding elements: FFFF+FFFF, A+D, B+E, and C+F. The 
addition of hexadecimal numbers FFFF and FFFF produces 1FFFE, a 
technical overflow condition for a 16-bit element of a general purpose 64-bit 
register. The present invention wraps the number 1FFFE around the accumulator's 
representable range, which is 48 bits. Since this is more than enough bits to 
represent the number, the entire number 1FFFE is written into the 
accumulator. As a result, no bits have been lost in the addition process. 
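
For comparison with the accumulating form, a load-style add can be sketched as follows. The names addl and wrap are hypothetical, small integers stand in for the lettered elements, and FFFF is again treated as an unsigned magnitude.

```python
ACC_BITS = 48  # accumulator element width assumed for 16-bit operands


def wrap(value, bits=ACC_BITS):
    """Wrap to the signed two's complement range of `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def addl(vs, vt):
    """Return new accumulator elements wrap(vs[i] + vt[i]); old contents are not used."""
    return [wrap(s + t) for s, t in zip(vs, vt)]


acc = [111, 222, 333, 444]  # stale contents, replaced by the load
acc = addl([0xFFFF, 1, 2, 3], [0xFFFF, 4, 5, 6])
print([hex(x) for x in acc])
```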

Accumulate Vector Multiply (MULA.fmt). The MULA.fmt instruction 
multiplies the values in vectors vt and vs. Then the product is added to the 
accumulator. Any overflows or underflows in the elements wrap around the 
accumulator's representable range and then are written into the accumulator. 
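
Repeated multiply-accumulate is where the wider accumulator matters most. The Python sketch below, with the hypothetical names mula and wrap, accumulates 100 products of signed 16-bit values into 48-bit elements; the operand values are arbitrary test data.

```python
ACC_BITS = 48  # accumulator element width assumed for 16-bit operands


def wrap(value, bits=ACC_BITS):
    """Wrap to the signed two's complement range of `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mula(acc, vs, vt):
    """acc[i] <- wrap(acc[i] + vs[i] * vt[i]) for every element i."""
    return [wrap(a + s * t) for a, s, t in zip(acc, vs, vt)]


acc = [0, 0, 0, 0]
vs = [32767, -32768, 1000, -1]
vt = [32767, 32767, 1000, -1]
for _ in range(100):
    acc = mula(acc, vs, vt)
print(acc)
```

Each single product already needs up to 32 bits, and the running sums exceed 32 bits, but they remain exact within the 48-bit elements.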

Add Vector Multiply to Accumulator (MULL.fmt). The MULL.fmt 
instruction multiplies the values in vectors vt and vs. Then, the product is 
written to the accumulator. Any overflows or underflows in the elements 
wrap around the accumulator's representable range and then are written into 
the accumulator. 

Subtract Vector Multiply from Accumulator (MULS.fmt). In the 
MULS.fmt instruction, the values in vector vt are multiplied by the values in 
vector vs, and the product is subtracted from the accumulator. Any 
overflows or underflows in the elements wrap around the accumulator's 
representable range and then are written into the accumulator. 

Load Negative Vector Multiply (MULSL.fmt). The MULSL.fmt 
instruction multiplies the values in vector vt with the values in vector vs. 
Then, the product is subtracted from the accumulator. Any overflows or 
underflows in the elements wrap around the accumulator's representable 
range and then are written into the accumulator. 

Accumulate Vector Difference (SUBA.fmt). The present SUBA.fmt 
instruction computes the difference between vectors vt and vs. Then, it adds 
the difference to the value in the accumulator. Any overflows or underflows 
in the elements wrap around the accumulator's representable range and then 
are written into the accumulator. 

Load Vector Difference (SUBL.fmt). According to the SUBL.fmt 
instruction, the differences between corresponding elements of vectors vt and 
vs are written into the corresponding elements of the accumulator. Any 
overflows or underflows in the elements wrap around the 
accumulator's representable range and then are written into the accumulator. 

ELEMENT TRANSFORMATION 
After an arithmetic operation, the elements in the accumulator are 
transformed into the precision of the elements in the destination registers for 
further processing or for eventual storage into a memory unit. During the 
transformation process, the data in each accumulator element is packed to the 
precision of the destination operand. The present invention provides the 
following instruction method for such transformation. 

Scale, Round and Clamp Accumulator (Rx.fmt). According to the Rx.fmt 
instruction, the values in the accumulator are shifted right by the values 
specified in a vector field vt in the instruction opcode. This variable shift 
supports application or algorithm specific fixed point precision. The vt 
operands are values in integer vector format. The accumulator is in the 
corresponding accumulator vector format. 

Then, each element in the accumulator is rounded according to a mode 
specified by the instruction. The preferred embodiment of the invention 
allows three rounding modes: 1) round toward zero, 2) round to nearest with 
exactly halfway rounding away from zero, and 3) round to nearest with 
exactly halfway rounding to even. These rounding modes minimize 
truncation errors during arithmetic process. 

The elements are then clamped to either a signed or unsigned range of 
an exemplary destination vector register, vd. That is, the elements are 
saturated to the largest representable value for overflow and the smallest 
representable value for underflow. Hence, the clamping limits the resultant 
values to the minimum and maximum precision of the destination elements 
without overflow or underflow. 
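
The scale, round, and clamp sequence can be sketched for a single accumulator element as follows. The function name scale_round_clamp, the mode strings, and the choice to fold the right shift and the rounding into one integer division are all assumptions made for illustration, not details taken from the specification.

```python
def scale_round_clamp(value, shift, mode, bits, signed):
    """Shift right by `shift`, round per `mode`, then saturate to `bits` bits."""
    if mode not in ("toward_zero", "nearest_away", "nearest_even"):
        raise ValueError("unknown rounding mode: " + mode)
    if shift > 0:
        divisor = 1 << shift
        half = divisor >> 1
        sign = -1 if value < 0 else 1
        # Work on the magnitude so that truncation is toward zero.
        q, r = divmod(abs(value), divisor)
        if mode == "nearest_away" and r >= half:
            q += 1  # ties move away from zero
        elif mode == "nearest_even" and (r > half or (r == half and q & 1)):
            q += 1  # ties move to the even neighbor
        value = sign * q
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    return max(lo, min(hi, value))


for mode in ("toward_zero", "nearest_away", "nearest_even"):
    # 5 >> 1 is exactly halfway between 2 and 3.
    print(mode, scale_round_clamp(5, 1, mode, 16, signed=True))
```

With a shift of 1, the value 5 sits exactly halfway between 2 and 3, so the three modes give 2, 3, and 2 respectively.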

SAVING ACCUMULATOR STATE 
Since the vector accumulator is a special register, the present invention 
allows the contents of the accumulator to be saved in a general register. 
However, because the size of the elements of the accumulator is larger than 
the elements of general purpose registers, the transfer occurs in multiple 
chunks of constituent elements. The following instructions allow storage of 
the accumulator state. 

Read Accumulator (RAC.fmt). The RAC.fmt instruction reads a 
portion of the accumulator elements, preferably a third of the bits in 
elements, and saves the elements into a vector register. Specifically, this 
instruction method allows the least significant, middle significant, or most 
significant third of the bits of the accumulator elements to be assigned to a 
vector register such as vd. In this operation, the values extracted are not 
clamped. That is, the bits are simply copied into the elements of vector 
register, vd. 

Write Accumulator High (WACH.fmt). The WACH.fmt instruction 
loads portions of the accumulator from a vector register. Specifically, this 
instruction method writes the most significant third of the bits of the 
accumulator elements from a vector register such as vs. The least significant 
two thirds of the bits of the accumulator are not affected by this operation. 

Write Accumulator Low (WACL.fmt). According to the WACL.fmt 
instruction, the present invention loads two thirds of the accumulator from 
two vector registers. Specifically, this instruction method writes the least 
significant two thirds of the bits of the accumulator elements. The remaining 
upper one third of the bits of the accumulator elements are written by the 
sign bits of the corresponding elements of a vector register such as vs, 
replicated by 16 or 8 times, depending on the data type format specified in the 
instruction. 

RACL/RACM/RACH instructions followed by WACL/WACH instructions are 
used to save and restore the accumulator. This save/restore function is 
format independent: either format can be used to save or restore accumulator 
values generated by either QH or OB operations. Data conversion need not 
occur. The mapping between element bits of the OB format accumulator and 
bits of the same accumulator interpreted in QH format is implementation 
specific, but consistent for each implementation. 
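
A save and restore round trip can be sketched in Python for 48-bit accumulator elements held as raw bit patterns. The function names rac, wacl, and wach are hypothetical, and the assignment of registers to thirds in wacl (vt supplies the low third, vs supplies the middle third and the sign used to fill the high third) is an assumption; the text above states only that two vector registers are used and that the sign bits come from a register such as vs.

```python
ELEM_BITS = 16             # QH operands
ACC_BITS = 3 * ELEM_BITS   # 48-bit accumulator elements
MASK = (1 << ELEM_BITS) - 1


def rac(acc, third):
    """Read the low (0), middle (1), or high (2) third of each element, unclamped."""
    full = (1 << ACC_BITS) - 1
    return [((a & full) >> (ELEM_BITS * third)) & MASK for a in acc]


def wacl(vs, vt):
    """Write the low two thirds; the sign bit of vs fills the high third."""
    out = []
    for s, t in zip(vs, vt):
        high = MASK if s & (1 << (ELEM_BITS - 1)) else 0
        out.append((high << (2 * ELEM_BITS)) | (s << ELEM_BITS) | t)
    return out


def wach(acc, vs):
    """Overwrite only the high third of each element from vs."""
    low_two_thirds = (1 << (2 * ELEM_BITS)) - 1
    return [(s << (2 * ELEM_BITS)) | (a & low_two_thirds) for a, s in zip(acc, vs)]


acc = [0x0000_0002_FFFD, 0x1234_5678_9ABC, 0xFFFF_FFFF_FFFE, 0]
low, mid, high = rac(acc, 0), rac(acc, 1), rac(acc, 2)  # save
restored = wach(wacl(mid, low), high)                   # restore
print(restored == acc)
```

Because the reads copy bits without clamping and the writes place them back in the same positions, the round trip reproduces the original bit patterns.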

The present invention, a method for providing extended precision in 
SIMD vector arithmetic operations, utilizes an accumulator register. While 
the present invention has been described in particular embodiments, it 
should be appreciated that the present invention should not be construed as 
being limited by such embodiments, but rather construed according to the 
claims below. 

CLAIMS 

What is claimed is: 

1. In a computer system including a processor which contains a 
first set of N-bit data elements loaded into a first register and a second set of 
N-bit data elements loaded into a second register, a method for providing 
extended precision in single instruction multiple data (SIMD) arithmetic 
operations, comprising the steps of: 

fetching an arithmetic instruction from a memory unit; 

decoding the arithmetic instruction and reading the first vector register 
and the second vector register; 

executing the arithmetic instruction on corresponding N-bit data 
elements in the first register and second register to produce corresponding 
resulting elements; 

writing the resulting elements into corresponding elements of an 
accumulator; 

transforming each resulting element in the accumulator into N 
bits; and 

writing the transformed elements of N-bit width into a third register. 



2. The method as recited in Claim 1, wherein said decoding step 
further comprises the steps of: 

selecting an element from the second register; and 

copying the selected element into the other elements in the second 
register. 

3. The method as recited in Claim 1, wherein said arithmetic 
instruction is an addition of corresponding vector elements in the first and 
second vector registers. 

4. The method as recited in Claim 1, wherein said arithmetic 
instruction is a multiplication of corresponding vector elements in the first 
and second vector registers. 

5. The method as recited in Claim 1, wherein said arithmetic 
instruction is a subtraction of second vector register elements from the first 
vector register elements. 



6. The method as recited in Claim 1, wherein said accumulator is a 
register having an integer multiple of 64-bit width. 

7. The method as recited in Claim 1, wherein said accumulator is a 
register of 192-bits. 

8. The method as recited in Claim 1, wherein said transformation 
step further comprises the steps of: 

scaling the resulting elements in the accumulator by shifting the 
values in the resulting elements; 

rounding the scaled resulting elements in the accumulator; and 

clamping the rounded resulting elements. 



9. The method as recited in Claim 1, wherein said third register 
writing step further comprises the steps of: 

reading a portion of the accumulator elements; and 
writing the portion of the accumulator elements into the 
corresponding elements of said third register. 

10. The method as recited in Claim 9, wherein the portion is either 
the low third bits or the high third bits of the elements in the accumulator. 

11. The method as recited in Claim 1, wherein the values in the 
resulting elements are wrapped around the representable range of the 
accumulator elements. 

12. The method as recited in Claim 1, wherein the data elements are 
integers. 

13. The method as recited in Claim 1, wherein the first register, the 
second register, and the third register are floating point registers. 

14. The method as recited in Claim 1, wherein the first register, the 
second register, and the third register are each 64 bits wide. 

15. The method as recited in Claim 1, wherein N is 8. 

16. The method as recited in Claim 1, wherein N is 16. 

17. The method as recited in Claim 15, wherein the elements in the 
accumulator are each 24 bits wide. 

18. The method as recited in Claim 16, wherein the elements in the 
accumulator are each 48 bits wide. 

19. The method as recited in Claim 1, wherein said third register 
writing step further comprises the steps of: 

reading a portion of the accumulator elements; and 
writing the portion of the accumulator elements into the 
corresponding elements of said third register. 

20. The method as recited in Claim 19, wherein the portion is 
chosen from the low third bits, the middle third bits, or the high third bits of 
the elements in the accumulator. 

21. In a computer system including a processor which contains a 
first set of N-bit data elements loaded into a first register, a second set of N-bit 
data elements loaded into a second register, and an accumulator having a 
third set of data elements, a method for providing extended precision in 
single instruction multiple data (SIMD) arithmetic operations, comprising the 
steps of: 

fetching an arithmetic instruction from a memory unit; 

decoding the arithmetic instruction and reading the first vector register 
and the second vector register; 

executing the arithmetic instruction on corresponding data elements in 
the first and second vector registers to produce corresponding resulting 
elements; 

adding the resulting elements to the corresponding elements in the 
accumulator; 

writing the resulting elements into the accumulator; 

transforming each resulting element in the accumulator into an N-bit 
width element; and 

writing the transformed elements of N-bit width into a third register. 

22. The method as recited in Claim 21, wherein said decoding step 
further comprises the steps of: 

selecting an element from the second register; and 
copying the selected element into the other elements in the second 
register. 

23. The method as recited in Claim 21, wherein said arithmetic 
instruction is an addition of corresponding vector elements in the first and 
second vector registers. 

24. The method as recited in Claim 21, wherein said arithmetic 
instruction is a multiplication of corresponding vector elements in the first 
and second vector registers. 

25. The method as recited in Claim 21, wherein said arithmetic 
instruction is a subtraction of second vector register elements from the first 
vector register elements. 

26. The method as recited in Claim 21, wherein said accumulator is 
a register having a width that is an integer multiple of 64 bits. 

27. The method as recited in Claim 21, wherein said accumulator is 
a register of 192 bits. 

28. The method as recited in Claim 21, wherein said transformation 
step further comprises the steps of: 

scaling the resulting elements in the accumulator by shifting the 
values in the resulting elements; 

rounding the scaled resulting elements in the accumulator; and 
clamping the rounded resulting elements. 

29. The method as recited in Claim 21, wherein said third register 
writing step further comprises the steps of: 

reading a portion of the accumulator elements; and 
writing the portion of the accumulator elements into the 
corresponding elements of said third register. 

30. The method as recited in Claim 29, wherein the portion is either 
the low third bits or high third bits of the elements in the accumulator. 

31. The method as recited in Claim 21, wherein the values in the 
resulting elements are wrapped around the representable range of the 
accumulator elements. 

32. The method as recited in Claim 21, wherein the data elements 
are integers. 

33. The method as recited in Claim 21, wherein the first register, the 
second register, and the third register are floating point registers. 

34. The method as recited in Claim 21, wherein the first register, the 
second register, and the third register are each 64 bits wide. 

35. The method as recited in Claim 21, wherein N is 8. 

36. The method as recited in Claim 21, wherein N is 16. 

37. The method as recited in Claim 35, wherein the elements in the 
accumulator are each 24 bits wide. 

38. The method as recited in Claim 36, wherein the elements in the 
accumulator are each 48 bits wide. 

39. The method as recited in Claim 21, wherein said third register 
writing step further comprises the steps of: 

reading a portion of the accumulator elements; and 
writing the portion of the accumulator elements into the 
corresponding elements of said third register. 

40. The method as recited in Claim 39, wherein the portion is 
chosen from the low third bits, the middle third bits, or the high third bits of 
the elements in the accumulator. 
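
Editorial illustration (not part of the claims): the portion selection recited in Claims 19-20 and 39-40 can be sketched as follows, assuming each accumulator element is 3N bits wide (for example 48 bits when N is 16, as in Claims 18 and 38). The function name `read_third` and its parameters are hypothetical and are not taken from the specification.

```python
def read_third(acc_elem, n, which):
    """Return one N-bit third of a 3N-bit accumulator element.

    acc_elem: integer holding the accumulator element (assumed 3*n bits wide)
    n:        element width N in bits (e.g. 8 or 16)
    which:    "low", "middle", or "high"
    """
    # Bit offset of the requested third inside the accumulator element.
    shift = {"low": 0, "middle": n, "high": 2 * n}[which]
    # Shift the requested third down to bit 0 and mask off everything else.
    return (acc_elem >> shift) & ((1 << n) - 1)


# A 48-bit accumulator element split into three 16-bit portions.
elem = 0x123456789ABC
portions = [read_third(elem, 16, w) for w in ("low", "middle", "high")]
```

Each portion is N bits wide, so it fits one element of the N-bit destination register; three reads recover the whole accumulator element.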

ABSTRACT 

The present invention provides extended precision in SIMD arithmetic 
operations in a processor having a register file and an accumulator. A first set 
of data elements and a second set of data elements are loaded into a first 
vector register and a second vector register, respectively. Each data element 
comprises N bits. Next, an arithmetic instruction is fetched from memory. 
The arithmetic instruction is decoded. Then, a first vector register and a 
second vector register are read from the register file. The present invention 
then executes the arithmetic instruction on corresponding data elements in 
the first and second vector registers. The result of the execution is then 
written into the accumulator. Then, each element in the accumulator is 
transformed into an N-bit width element and stored into the memory. 
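
Editorial illustration (not part of the specification): the flow summarized in the abstract can be sketched as below. The sketch assumes signed 16-bit data elements, 48-bit accumulator elements that wrap around their representable range, and clamping as the transformation back to N bits. All function names are hypothetical, and the abstract itself does not fix these choices.

```python
import operator


def wrap_signed(value, bits):
    """Wrap an integer into the signed two's-complement range of `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def execute_into_accumulator(op, vs, vt, acc_bits, acc=None):
    """Apply `op` to corresponding elements of vs and vt.

    If `acc` is given, each result is added to the existing accumulator
    element. Every accumulator element is wrapped to `acc_bits` bits.
    """
    results = [op(a, b) for a, b in zip(vs, vt)]
    if acc is not None:
        results = [r + c for r, c in zip(results, acc)]
    return [wrap_signed(r, acc_bits) for r in results]


def clamp_to_width(acc, n):
    """Transform each accumulator element into the signed N-bit range."""
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    return [max(lo, min(hi, e)) for e in acc]


vs = [30000, -30000, 100, 7]   # first vector register, four 16-bit elements
vt = [30000, 30000, 2, -3]     # second vector register
acc = execute_into_accumulator(operator.mul, vs, vt, acc_bits=48)
acc = execute_into_accumulator(operator.mul, vs, vt, acc_bits=48, acc=acc)
out = clamp_to_width(acc, 16)  # N-bit elements written back
```

The products of the first two element pairs do not fit in 16 bits, but the wider accumulator holds them exactly across both operations; precision is reduced only once, at the final transformation.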



SGI-15-4-458.00 

October 6, 1997 



bits      63-48          47-32          31-16          15-0 
vs        vs[3]          vs[2]          vs[1]          vs[0] 
            +              +              +              + 
vt        vt[3]          vt[2]          vt[1]          vt[0] 
result    vs[3]+vt[3]    vs[2]+vt[2]    vs[1]+vt[1]    vs[0]+vt[0] 



FIG. 1 

(Prior Art) 
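
Editorial illustration (not part of the drawings): FIG. 1 shows an element-wise add of four 16-bit fields packed in a 64-bit register. The sketch below assumes that each sum is kept at the 16-bit element width, so a carry out of an element is discarded. The helper names are hypothetical.

```python
MASK16 = 0xFFFF


def unpack4x16(reg):
    """Split a 64-bit value into four 16-bit elements; index 0 is bits 15-0."""
    return [(reg >> (16 * i)) & MASK16 for i in range(4)]


def pack4x16(elems):
    """Pack four elements into a 64-bit value, truncating each to 16 bits."""
    reg = 0
    for i, e in enumerate(elems):
        reg |= (e & MASK16) << (16 * i)
    return reg


def packed_add16(vs, vt):
    """Add corresponding 16-bit elements; each sum is truncated to 16 bits."""
    sums = [a + b for a, b in zip(unpack4x16(vs), unpack4x16(vt))]
    return pack4x16(sums)


vs = pack4x16([1, 2, 3, 0xFFFF])
vt = pack4x16([1, 1, 1, 1])
vd = packed_add16(vs, vt)  # element 3 overflows and wraps to 0
```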










[Flowchart; the step reference numerals are illegible in this copy] 

Fetch arithmetic instruction 
  -> Decode the arithmetic instruction 
  -> Read operand registers 
  -> Execute the arithmetic instruction on each element of the operands 
  -> Write the resulting elements into accumulator 
  -> Transform the resulting elements into N-bits 
  -> Store the transformed elements into memory 
  -> Done 



FIG. 5 






Exhibit E 



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

In re application of: Hsu et al 
Appl. No.: 09/223,046 
Filed: December 30, 1998 
For: Method for Providing Extended Precision in SIMD Vector Arithmetic Operations 
Art Unit: 2783 
Examiner: L. Donaghue 
Atty. Docket: 0056.10US 

Amendment and Reply under 37 C.F.R. § 1.111 



Commissioner for Patents 
Washington, D.C. 20231 

Sir: 

In reply to the Office Action dated February 11, 2000, (PTO Prosecution File Wrapper 
Paper No. 15), Applicants submit the following Amendment and Remarks. 

It is not believed that extensions of time or fees for net addition of claims are required 
beyond those that may otherwise be provided for in documents accompanying this paper. 
However, if additional extensions of time are necessary to prevent abandonment of this 
application, then such extensions of time are hereby petitioned under 37 C.F.R. § 1.136(a), and 
any fees required therefor (including fees for net addition of claims) are hereby authorized to be 
charged to our Deposit Account No. 19-0036. 



In the Claims: 

Please cancel claim 1 without prejudice to or disclaimer of the subject matter contained 
therein. 

Please add the following new claims 41-78: 

41. A computer-based method for providing extended precision in single instruction multiple 
data (SIMD) arithmetic operations, comprising the steps of: 

(a) loading a first vector into a first register, said first vector comprising a 
plurality of N-bit elements; 

(b) loading a second vector into a second register, said second vector 
comprising a plurality of N-bit elements; 

(c) executing an arithmetic instruction for at least one pair consisting of an 
N-bit element in said first register and an N-bit element in said second register, to produce a 
resulting element; 

(d) writing said resulting element into an M-bit element of an accumulator, 
wherein M is greater than N; 

(e) transforming said resulting element in said accumulator into a width of 
N-bits; and 

(f) writing said resulting element into a third register. 

42. The method as recited in claim 41, wherein said accumulator comprises a plurality of 
M-bit elements and wherein steps (c)-(f) operate on a plurality of elements of said first and second 
vectors to produce a resultant vector formed from a plurality of resulting elements written to said 
third register. 

43. The method as recited in claim 42, further comprising a step before step (c) of: 

selecting an element from said second register; and 
copying said element into all other elements in said second register. 

44. The method as recited in claim 42, further comprising a step before step (f) of: 

selecting a subset of said resulting elements in said accumulator for writing to said 
third register, said subset being chosen from any one of: the low third bits, the middle third bits, 
and the high third bits of said resulting elements in said accumulator. 

45. The method as recited in claim 42, wherein M is equal to three times N. 



46. The method as recited in claim 45, wherein N is equal to eight or sixteen. 

47. The method as recited in claim 42, wherein said resulting elements in said accumulator 
are wrapped around the representable range of said resulting elements. 

48. The method as recited in claim 42, further comprising a step before step (f) of: 

dividing said resulting elements stored in said accumulator into a plurality of 
subsets; 

writing each subset to at least one of a plurality of registers, each of said plurality 
of registers having a width smaller than said accumulator width. 

49. The method as recited in claim 41, wherein said loading step (a) and said loading step (b) 
are not formatted. 

50. The method as recited in claim 41, further comprising a step before step (d) of: 

formatting said resulting element as specified in said arithmetic instruction. 

51. The method as recited in claim 41, wherein said arithmetic instruction is any one of: 
addition, multiplication and subtraction. 

52. The method as recited in claim 41, wherein step (e) comprises the steps of: 

shifting said resulting element in said accumulator for scaling the value of said 
resulting element; 

rounding said resulting element; and 

clamping said resulting element. 

53. The method as recited in claim 52, wherein said rounding step comprises one of: 

rounding said resulting element towards zero; 

rounding said resulting element towards the nearest unit, wherein said resulting 
element is rounded away from zero if said resulting element is at least halfway towards the 
nearest unit; and 

rounding said resulting element towards the nearest unit, wherein said resulting 
element is rounded towards zero if said resulting element is at least halfway towards the nearest 
unit. 

54. The method as recited in claim 41, further comprising a step before step (d) of: 

adding an element previously stored in said accumulator to said resulting element. 

55. The method as recited in claim 41, wherein N is any one of: eight, sixteen, thirty-two and 
sixty-four. 

56. The method as recited in claim 55, wherein said N-bit elements are integers. 

57. The method as recited in claim 55, wherein each of said first and second vectors has a 
width of 64 bits. 

58. The method as recited in claim 57, wherein said accumulator is a register having a width 
equal to an integer multiple of 64 bits. 

59. The method as recited in claim 58, wherein said accumulator is a register having a width 
of 192 bits. 

60. The method as recited in claim 41, wherein said first register, said second register, and 
said third register are floating point registers. 

61. The method as recited in claim 41, wherein said first register, said second register, and 
said third register each have a width of 64-bits. 

62. A processor for providing extended precision in single instruction multiple data (SIMD) 
arithmetic operations, comprising: 

means for executing an arithmetic instruction involving an element of a first 
vector and an element of a second vector to produce a resulting element, said first and second 
vector comprising a plurality of N-bit elements; 

an accumulator for receiving said resulting element, wherein said resulting 
element is stored in an M-bit element of said accumulator and wherein M is greater than N; 

means for transforming said resulting element in said accumulator into a width 
of N-bits; and 

means for writing said transformed resulting element to a register. 

63. The processor as recited in claim 62, wherein said accumulator comprises a plurality of 
M-bit elements and wherein said means for executing is repeated for said plurality of elements 
of said first and second vectors to produce a plurality of resulting elements that are received by 
said accumulator and wherein said means for transforming and said means for writing are 
performed on said plurality of resulting elements. 

64. The processor as recited in claim 63, wherein means for writing comprises: 

selecting a subset of said resulting elements in said accumulator for writing to said 
register, said subset being chosen from any one of: the low third bits, the middle third bits, and 
the high third bits of said resulting elements in said accumulator. 

65. The processor as recited in claim 63, wherein M is equal to three times N. 

66. The processor as recited in claim 65, wherein N is equal to eight or sixteen. 

67. The system as recited in claim 63, wherein said resulting elements in said accumulator 
are wrapped around the representable range of said resulting elements. 

68. The system as recited in claim 63, further comprising: 

dividing said resulting elements stored in said accumulator into a plurality of 
subsets; 

writing each subset to at least one of a plurality of registers, each of said plurality 
of registers having a width smaller than said accumulator width. 

69. The system as recited in claim 62, further comprising: 

means for formatting said resulting element in said accumulator as specified in 
said arithmetic instruction. 

70. The processor as recited in claim 62, wherein said arithmetic instruction is any one of: 
addition, multiplication and subtraction. 

71. The processor as recited in claim 62, wherein means for transforming comprises: 

means for shifting said resulting element in said accumulator for scaling the value 
of said resulting element; 

means for rounding said resulting element; and 

means for clamping said resulting element. 

72. The processor as recited in claim 71, wherein said rounding means comprises one of: 

means for rounding said resulting element towards zero; 

means for rounding said resulting element towards the nearest unit, wherein said 
resulting element is rounded away from zero if said resulting element is at least halfway towards 
the nearest unit; and 

means for rounding said resulting element towards the nearest unit, wherein said 
resulting element is rounded towards zero if said resulting element is at least halfway towards 
the nearest unit. 

73. The processor as recited in claim 62, further comprising: 

means for adding an element previously stored in said accumulator to said 
resulting element, upon reception of said resulting element by said accumulator. 

74. The processor as recited in claim 62, wherein N is any one of: eight, sixteen, thirty-two 
and sixty-four. 

75. The processor as recited in claim 74, wherein said N-bit elements are integers. 

76. The processor as recited in claim 74, wherein each of said first and said second vectors 
has a width of 64 bits. 

77. The processor as recited in claim 76, wherein said accumulator is a register having a 
width equal to an integer multiple of 64 bits. 

78. The processor as recited in claim 77, wherein said accumulator is a register having a 
width of 192 bits. 



Remarks 



Upon entry of the foregoing amendment, claims 41-78 are pending in the application. 
This amendment seeks to cancel claim 1 and add new claims 41-78. These changes are believed 
to introduce no new matter, and their entry is respectfully requested. 

The Examiner has rejected claim 1 under 35 U.S.C. § 101 for double patenting in view 
of U.S. Patent No. 5,864,703. Applicants have canceled claim 1. Thus, this rejection is now 
moot. By the foregoing, Applicants seek to add new claims 41-78. Favorable consideration and 
allowance of these new claims is respectfully solicited. 

The Examiner is invited to telephone the undersigned representative if he believes that 
an interview might be useful for any reason. 

Respectfully submitted, 



Sterne, Kessler, Goldstein & Fox p.l.l.c. 




Michael B. Ray 
Attorney for Applicants 
Registration No. 33,997 

Date: 8/11/00 

1100 New York Ave, N.W., Suite 600 
Washington, DC 20005 
(202) 371-2600 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

In re application of: Hsu et al 
Appl. No.: 09/223,046 
Filed: December 30, 1998 
For: Method for Providing Extended Precision In SIMD Vector Arithmetic Operations 
Art Unit: 2154 
Examiner: Donaghue, L. 
Atty. Docket: 0056.10US 
Batch No. R88 

Amendment Under 37 C.F.R. § 1.312 



Attn: Box Issue Fee 



Commissioner for Patents 
Washington, D.C. 20231 

Sir: 

Submitted herein is an Amendment Under 37 C.F.R. § 1.312. As payment of the issue 
fee has not yet been made or is filed herewith, Applicants respectfully submit that filing under 
paragraph (a) of 37 C.F.R. § 1.312 is proper. (M.P.E.P. § 714.16.) 

It is believed that extensions of time are not required beyond those that may otherwise 
be provided for in documents accompanying this Amendment. However, if additional extensions 
of time are necessary to prevent abandonment of this application, then such extensions of time 
are hereby petitioned under 37 C.F.R. § 1.136(a), and any fees required therefor are hereby 
authorized to be charged to our Deposit Account No. 19-0036. 

If the Examiner believes, for any reason, that personal communication will 
expedite acceptance of this Amendment, the Examiner is invited to telephone the 
undersigned at the number provided. 






Amendment 



Please enter the following Amendment: 



In the Drawings: 



Please amend FIG. 2 as shown in red. 



Remarks 



Applicants have noticed that within computer system 212 of FIG. 2, an element number 
"204" is erroneously present. Accordingly, Applicants are now submitting a new FIG. 2 without 
the incorrect element number. FIG. 2 is described in the Brief Description of the Drawings on 
page 6 of the specification and in the Detailed Description of Preferred Embodiments on pages 
7 and 8 of the specification. Applicants assert that the corrected FIG. 2 does not constitute new 
matter because this drawing is clearly consistent with the description of FIG. 2 in the application 
as originally filed. Entry of the above amendment is respectfully requested. 

Applicants have concurrently submitted a Request to Approve Proposed Drawing 
Corrections and one sheet of drawings containing the proposed correction to original FIG. 2, 
shown in red ink. The proposed changes add no new matter to this application. 



Respectfully submitted, 

Sterne, Kessler, Goldstein & Fox p.l.l.c. 

Attorney for Applicants 
Registration No. 33,997 

1100 New York Avenue, N.W., Suite 600 
Washington, D.C. 20005-3934 
(202) 371-2600 
MBR/MPT/agj 




IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

In re application of: Hsu et al 
Appl. No.: 09/223,046 
Filed: December 30, 1998 
For: Method for Providing Extended Precision In SIMD Vector Arithmetic Operations 
Art Unit: 2154 
Examiner: Donaghue, L. 
Atty. Docket: 0056.10US 
Batch No. R88 

Request to Approve Proposed Drawing Corrections 

Commissioner for Patents 
Washington, D.C. 20231 

Sir: 

Attached is a copy of 1 drawing sheet, containing a proposed correction to Figure 2, 
shown in red. The proposed change adds no new matter to this application. Applicants 
request that the Examiner approve the proposed correction. Also submitted herewith are 
Formal Drawings, which correspond to the changes noted on the attached request. 

Respectfully submitted, 



Sterne, Kessler, Goldstein & Fox p.l.l.c. 

Michael B. Ray 
Attorney for Applicants 
Registration No. 33,997 




1100 New York Avenue, N.W., Suite 600 
Washington, D.C. 20005-3934 
(202) 371-2600 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

In re application of: Timothy van Hook et al. 
Appl. No.: 09/233,046 
Filed: December 30, 1998 
For: Methods for Providing Extended Precision in SIMD Vector Arithmetic Operations 
Confirmation No.: 2296 
Art Unit: 2154 
Examiner: L. Donaghue 
Atty. Docket: 0056.10US 

Amendment and Reply under 37 C.F.R. § 1.111 



Commissioner for Patents 
Washington, D.C. 20231 

Sir: 

In reply to the Office Action dated March 15, 2002, (PTO Prosecution File Wrapper 
Paper No. 39), Applicants submit the following Amendment and Remarks. This 
Amendment is provided in the following format: 

(A) A clean version of each replacement paragraph/section/claim along 
with clear instructions for entry; 

(B) Starting on a separate page, appropriate remarks and arguments. 37 
C.F.R. § 1.121 and MPEP 714; and 

(C) Starting on a separate page, a marked-up version entitled: "Version 
with markings to show changes made." 

It is not believed that extensions of time or fees for net addition of claims are 
required beyond those that may otherwise be provided for in documents accompanying this 
paper. However, if additional extensions of time are necessary to prevent abandonment of 
this application, then such extensions of time are hereby petitioned under 37 C.F.R. 
§ 1.136(a), and any fees required therefor (including fees for net addition of claims) are 
hereby authorized to be charged to our Deposit Account No. 19-0036. 

Amendments 

In the Claims: 

Please cancel claims 41-46, 48-52, 54-66, 68-71 and 73-78 without prejudice or 
disclaimer. 

Please substitute the following claims 47 and 67 for the pending claims 47 and 67: 

47. (Once Amended) A computer-based method for providing extended precision in single 
instruction multiple data (SIMD) arithmetic operations, comprising the steps of: 

(a) loading a first vector into a first register, said first vector comprising a 
plurality of N-bit elements; 

(b) loading a second vector into a second register, said second vector comprising 
a plurality of N-bit elements; 

(c) executing an arithmetic instruction for at least one pair consisting of an N-bit 
element in said first register and an N-bit element in said second register, to produce a 
resulting element; 

(d) writing said resulting element into an M-bit element of an accumulator, 
wherein M is greater than N; 

(e) transforming said resulting element in said accumulator into a width of N- 
bits; and 

(f) writing said resulting element into a third register; 

wherein said accumulator comprises a plurality of M-bit elements and wherein steps 
(c)-(f) operate on a plurality of elements of said first and second vectors to produce a 
resultant vector formed from a plurality of resulting elements written to said third register; 
and 

wherein said resulting elements in said accumulator are wrapped around the 
representable range of said resulting elements. 

53. (Once Amended) The method as recited in claim 80, wherein said rounding step 
comprises one of: 

rounding said resulting element towards zero; 

rounding said resulting element towards the nearest unit, wherein said resulting 
element is rounded away from zero if said resulting element is at least halfway towards the 
nearest unit; and 

rounding said resulting element towards the nearest unit, wherein said resulting 
element is rounded towards zero if said resulting element is at least halfway towards the 
nearest unit. 

67. (Once Amended) A processor for providing extended precision in single instruction 
multiple data (SIMD) arithmetic operations, comprising: 

means for executing an arithmetic instruction involving an element of a first vector 
and an element of a second vector to produce a resulting element, said first and second 
vector comprising a plurality of N-bit elements; 

an accumulator for receiving said resulting element, wherein said resulting element 
is stored in an M-bit element of said accumulator and wherein M is greater than N; 

means for transforming said resulting element in said accumulator into a width of N- 
bits; and 

means for writing said transformed resulting element to a register; 

wherein said accumulator comprises a plurality of M-bit elements and wherein said 
means for executing is repeated for said plurality of elements of said first and second vectors 
to produce a plurality of resulting elements that are received by said accumulator and 
wherein said means for transforming and said means for writing are performed on said 
plurality of resulting elements; and 

wherein said resulting elements in said accumulator are wrapped around the 
representable range of said resulting elements. 

72. The processor as recited in claim 82, wherein said rounding means comprises one 
of: 

means for rounding said resulting element towards zero; 

means for rounding said resulting element towards the nearest unit, wherein said 
resulting element is rounded away from zero if said resulting element is at least halfway 
towards the nearest unit; and 

means for rounding said resulting element towards the nearest unit, wherein said 
resulting element is rounded towards zero if said resulting element is at least halfway 
towards the nearest unit. 
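
Editorial illustration (not part of the claims): the three rounding alternatives recited in claims 53 and 72 can be sketched for a fixed-point value as below. The sketch assumes the value carries `frac_bits` fractional bits (at least one) and reads the third alternative as round-to-nearest with exact halves going towards zero. That reading, and the function names, are assumptions rather than statements from the application.

```python
def round_toward_zero(value, frac_bits):
    """Drop the fractional bits, truncating towards zero."""
    q = abs(value) >> frac_bits
    return q if value >= 0 else -q


def round_half_away_from_zero(value, frac_bits):
    """Round to the nearest unit; an exact half moves away from zero."""
    half = 1 << (frac_bits - 1)
    q = (abs(value) + half) >> frac_bits
    return q if value >= 0 else -q


def round_half_toward_zero(value, frac_bits):
    """Round to the nearest unit; an exact half moves towards zero."""
    half = 1 << (frac_bits - 1)
    q = (abs(value) + half - 1) >> frac_bits
    return q if value >= 0 else -q


# 1.5 and -1.5 expressed with two fractional bits (6/4 and -6/4).
samples = [(round_toward_zero(v, 2),
            round_half_away_from_zero(v, 2),
            round_half_toward_zero(v, 2)) for v in (6, -6)]
```

Working on the magnitude and restoring the sign afterwards keeps all three modes symmetric about zero, which is what distinguishes them from a plain arithmetic right shift.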

Please add the following new claims 79-82: 

79. A computer-based method for providing extended precision in single instruction 
multiple data (SIMD) arithmetic operations, comprising the steps of: 

(a) loading a first vector into a first register, said first vector comprising a 
plurality of N-bit elements; 

(b) loading a second vector into a second register, said second vector comprising 
a plurality of N-bit elements; 

(c) executing an arithmetic instruction for at least one pair consisting of an N-bit 
element in said first register and an N-bit element in said second register, to produce a 
resulting element; 

(d) writing said resulting element into an M-bit element of an accumulator, 
wherein M is greater than N; 

(e) transforming said resulting element in said accumulator into a width of 
N-bits; 

(f) dividing said resulting elements stored in said accumulator into a plurality 
of subsets; 

(g) writing each subset to at least one of a plurality of registers, each of said 
plurality of registers having a width smaller than said accumulator width; and 

(h) writing said resulting element into a third register; 

wherein said accumulator comprises a plurality of M-bit elements and wherein steps 
(c)-(h) operate on a plurality of elements of said first and second vectors to produce a 
resultant vector formed from a plurality of resulting elements written to said third register. 

80. A computer-based method for providing extended precision in single instruction 
multiple data (SIMD) arithmetic operations, comprising the steps of: 

(a) loading a first vector into a first register, said first vector comprising a 
plurality of N-bit elements; 

(b) loading a second vector into a second register, said second vector comprising 
a plurality of N-bit elements; 

(c) executing an arithmetic instruction for at least one pair consisting of an N-bit 
element in said first register and an N-bit element in said second register, to produce a 
resulting element; 

(d) writing said resulting element into an M-bit element of an accumulator, 
wherein M is greater than N; 

(e) transforming said resulting element in said accumulator into a width of N- 
bits, wherein said transforming comprises shifting said resulting element in said accumulator 
for scaling the value of said resulting element, rounding said resulting element and clamping 
said resulting element; and 

(f) writing said resulting element into a third register. 

81. A processor for providing extended precision in single instruction multiple data 
(SIMD) arithmetic operations, comprising: 

means for executing an arithmetic instruction involving a first plurality of elements 
of a first vector and a second plurality of elements of a second vector to produce a plurality 
of resulting elements, said first and second vector comprising a plurality of N-bit elements; 

an accumulator for receiving said plurality of resulting elements, wherein said 
plurality of resulting elements are each stored in one of a plurality of M-bit elements of said 
accumulator and wherein M is greater than N; 

means for transforming said plurality of resulting elements in said accumulator into 
a width of N-bits; 

means for dividing said plurality of resulting elements stored in said accumulator into 
a plurality of subsets; and 

means for writing each subset to at least one of a plurality of registers, each of said 
plurality of registers having a width smaller than said accumulator width. 




82. A processor for providing extended precision in single instruction multiple data 
(SIMD) arithmetic operations, comprising: 

means for executing an arithmetic instruction involving an element of a first vector 
and an element of a second vector to produce a resulting element, said first and second 
vector comprising a plurality of N-bit elements; 

an accumulator for receiving said resulting element, wherein said resulting element 
is stored in an M-bit element of said accumulator and wherein M is greater than N; 

means for transforming said resulting element in said accumulator into a width of N- 
bits, wherein said means for transforming comprises means for shifting said resulting 
element in said accumulator for scaling the value of said resulting element, means for 
rounding said resulting element, and means for clamping said resulting element; and 

means for writing said transformed resulting element to a register. 
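
Editorial illustration (not part of the claims): the transformation recited in claims 80 and 82 (shifting for scaling, rounding, then clamping) can be sketched as below. The sketch assumes signed values, a right shift by `shift` bits as the scaling, and round-to-nearest with halves rounded up as the rounding mode. The function name `transform` and these specific choices are hypothetical.

```python
def transform(acc_elem, shift, n):
    """Scale, round, and clamp one accumulator element to a signed N-bit value."""
    if shift > 0:
        # Adding half of the discarded range before the arithmetic shift
        # rounds to the nearest unit (an exact half rounds up).
        scaled = (acc_elem + (1 << (shift - 1))) >> shift
    else:
        scaled = acc_elem
    # Clamp (saturate) to the representable signed N-bit range.
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    return max(lo, min(hi, scaled))


acc = [900000000, -900000000, 1000, -1000]     # wide accumulator elements
out = [transform(e, shift=4, n=16) for e in acc]
```

Clamping last means an out-of-range result saturates at the nearest representable value instead of wrapping, which is the difference from the wrapped-around behaviour recited in claims 47 and 67.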

Remarks 

Reconsideration of this application is respectfully requested. 

Upon entry of the foregoing amendment, claims 47, 53, 67, 72 and 79-82 are pending 
in the application, with 47, 67 and 79-82 being the independent claims. Claims 41-46, 48- 
52, 54-66, 68-71 and 73-78 are sought to be cancelled without prejudice to or disclaimer of 
the subject matter therein. Claims 47, 53, 67 and 72 have been amended. New independent 
claims 79-82 have been added to replace allowable dependent claims 48, 52, 68 and 71, 
respectively, including the features of their respective base claims. These changes are 
believed to introduce no new matter, and their entry is respectfully requested. 

Based on the above amendment and the following remarks, Applicants respectfully 
request that the Examiner reconsider all outstanding objections and rejections and that they 
be withdrawn. 

Examiner Interview 

Applicants and Applicants' representative wish to thank Examiner Donaghue for
conducting the personal interview with Applicants' undersigned representative on January
24, 2002. The Examiner Interview Summary Record accurately reflects the substance of the 
interview. 

Rejections and Amendments 

In the Office Action dated March 15, 2002, claims 47, 48, 52, 53, 67, 68, 71 and 72 
were "objected to as being dependent upon a rejected base claim, but would be allowable 
if rewritten in independent form including all of the limitations of the base claim and any 
intervening claims." Accordingly, by way of the above Amendments, Applicants have 
placed these claims in independent form. To expedite issuance of the allowed claims, the 
rejected claims are being cancelled without prejudice or disclaimer. Applicants reserve the 
right and hereby give notice of their intent to pursue those claims and traverse the rejection 
in a continuation application, which will be filed in due course. 
In conclusion, Applicants respectfully request that the allowed claims be passed to
issue and that the rejections be withdrawn as moot in light of the cancellation of the rejected 
claims. 

Conclusion 

All of the stated grounds of objection and rejection have been properly traversed, 
accommodated, or rendered moot. Applicants therefore respectfully request that the 
Examiner reconsider all presently outstanding objections and rejections and that they be 
withdrawn. Applicants believe that a full and complete reply has been made to the 
outstanding Office Action and, as such, the present application is in condition for allowance. 
If the Examiner believes, for any reason, that personal communication will expedite 
prosecution of this application, the Examiner is invited to telephone the undersigned at the 
number provided. 

Prompt and favorable consideration of this Amendment and Reply is respectfully 
requested. 

Respectfully submitted,

Sterne, Kessler, Goldstein & Fox p.l.l.c.

Donald J. Featherstone
Attorney for Applicants
Registration No. 33,876

Date: [illegible]

1100 New York Avenue, N.W.
Suite 600
Washington, D.C. 20005-3934
(202) 371-2600

DJF/mmb
SKGF_DC1:21053.5
SKGFRev. 4/9/02
Version with markings to show changes made

Claims 41-46, 48-52, 54-66, 68-71 and 73-78 have been cancelled 

47. (Once Amended) [The method as recited in claim 42,] A computer-based method for
providing extended precision in single instruction multiple data (SIMD) arithmetic
operations, comprising the steps of:

(a) loading a first vector into a first register, said first vector comprising a
plurality of N-bit elements;

(b) loading a second vector into a second register, said second vector comprising
a plurality of N-bit elements;

(c) executing an arithmetic instruction for at least one pair consisting of an N-bit
element in said first register and an N-bit element in said second register, to produce a
resulting element;

(d) writing said resulting element into an M-bit element of an accumulator,
wherein M is greater than N;

(e) transforming said resulting element in said accumulator into a width of N-
bits; and

(f) writing said resulting element into a third register;

wherein said accumulator comprises a plurality of M-bit elements and wherein steps
(c)-(f) operate on a plurality of elements of said first and second vectors to produce a
resultant vector formed from a plurality of resulting elements written to said third register;
and

wherein said resulting elements in said accumulator are wrapped around the 
representable range of said resulting elements. 

53. (Once Amended) The method as recited in claim [52] 80, wherein said rounding step 
comprises one of: 

rounding said resulting element towards zero; 
rounding said resulting element towards the nearest unit, wherein said resulting
element is rounded away from zero if said resulting element is at least halfway towards the 
nearest unit; and 

rounding said resulting element towards the nearest unit, wherein said resulting 
element is rounded towards zero if said resulting element is at least halfway towards the 
nearest unit. 

67. (Once Amended) [The system as recited in claim 63,] A processor for providing
extended precision in single instruction multiple data (SIMD) arithmetic operations,
comprising:

means for executing an arithmetic instruction involving an element of a first vector
and an element of a second vector to produce a resulting element, said first and second
vector comprising a plurality of N-bit elements;

an accumulator for receiving said resulting element, wherein said resulting element
is stored in an M-bit element of said accumulator and wherein M is greater than N;

means for transforming said resulting element in said accumulator into a width of
N-bits; and

means for writing said transformed resulting element to a register;

wherein said accumulator comprises a plurality of M-bit elements and wherein said
means for executing is repeated for said plurality of elements of said first and second vectors
to produce a plurality of resulting elements that are received by said accumulator and
wherein said means for transforming and said means for writing are performed on said
plurality of resulting elements; and

wherein said resulting elements in said accumulator are wrapped around the 
representable range of said resulting elements. 

72. (Once Amended) The processor as recited in claim [71] 82, wherein said rounding 
means comprises one of: 

means for rounding said resulting element towards zero; 
means for rounding said resulting element towards the nearest unit, wherein said
resulting element is rounded away from zero if said resulting element is at least halfway 
towards the nearest unit; and 

means for rounding said resulting element towards the nearest unit, wherein said 
resulting element is rounded towards zero if said resulting element is at least halfway 
towards the nearest unit. 



New claims 79-82 have been added. 
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

In re application of: Van Hook et al.
Appl. No.: 09/223,046
Filed: December 30, 1998
For: Method for Providing Extended Precision in SIMD Vector Arithmetic Operations
Confirmation No.: 2296
Art Unit: 2154
Examiner: Donaghue, Larry D.
Atty. Docket: 1778.0110001

Amendment Under 37 C.F.R. § 1.114



Commissioner for Patents 

P.O. Box 1450 

Alexandria, VA 22313-1450 

Sir: 

Following the filing of a Request for Continued Examination under 37 C.F.R. § 1.114 on
December 30, 2003, Applicants submit herewith the following Amendment and Remarks.
This Amendment is provided in the following format: 

(A) Each section begins on a separate sheet; 

(B) Starting on a separate sheet, amendments to the specification by presenting 
replacement paragraphs marked up to show changes made; 

(C) Starting on a separate sheet, a complete listing of all of the claims: 

- in ascending order; 

- with status identifiers; and 

- with markings in the currently amended claims; 

(D) Starting on a separate sheet, the Remarks. 

It is not believed that extensions of time or fees for net addition of claims are required 
beyond those that may otherwise be provided for in documents accompanying this paper. 
However, if additional extensions of time are necessary to prevent abandonment of this 
application, then such extensions of time are hereby petitioned under 37 C.F.R. § 1.136(a),
and any fees required therefor (including fees for net addition of claims) are hereby
authorized to be charged to our Deposit Account No. 19-0036. 
Amendments to the Specification



Please amend the specification as indicated. 



Please amend the paragraph starting on page 7, line 22, as follows: 

FIG. 2 illustrates an exemplary computer system 212 comprised of a system bus 200 
for communicating information, one or more central processors 201 coupled with the bus 200 
for processing information and instructions, a computer readable volatile memory unit 202 
(e.g., random access memory, static RAM, dynamic RAM, etc.) coupled with the bus 200 for 
storing information and instructions for the central processor(s) 201, a computer readable 
non-volatile memory unit 203 (e.g., read only memory, programmable ROM, flash memory, 
EPROM, EEPROM, etc.) coupled with the bus 200 for storing static information and 
instructions for the processor(s). 

Please amend the paragraph starting on page 9, line 11, as follows:

The vector register file is comprised of 32 64-bit general purpose registers 306 
through 310. The general purpose registers 306 through 310 are visible to the programmer 
and can be used to store intermediate results. The preferred embodiment of the present 
invention uses the floating point registers [(FGR)] (FPR) of a floating point unit (FPU) as its
vector registers. 
Please amend the paragraph starting on page 13, line 26, as follows:

Integer vector operations that write to the FPRs clamp the values being written to the 
target's representable range. That is, the elements are saturated for overflows and [under flows]
underflows. For overflows, the values are clamped to the largest representable value. For 
underflows, the values are clamped to the smallest representable value. 
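
The clamping described in this paragraph can be sketched as follows, in Python and for illustration only. The helper name `clamp_signed` and the 16-bit default width are assumptions, not taken from the specification, and the elements are assumed to be signed two's-complement values.

```python
def clamp_signed(value: int, n_bits: int = 16) -> int:
    """Saturate `value` to the range of a signed n_bits element.

    Hypothetical helper: the name and the 16-bit default are illustrative.
    """
    largest = (1 << (n_bits - 1)) - 1    # largest representable value
    smallest = -(1 << (n_bits - 1))      # smallest representable value
    # Overflows clamp to the largest value, underflows to the smallest.
    return max(smallest, min(largest, value))
```

Under these assumptions, a value already inside the representable range passes through unchanged; only out-of-range values are saturated.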

Please amend the paragraph starting on page 17, line 19, as follows: 

Load Vector Add (ADDL.fmt). According to the ADDL.fmt instruction, the 
corresponding elements in vectors vt and vs are added and then stored into corresponding 
elements in the accumulator. Any overflows or underflows in the elements wrap around the 
accumulator's representable range and then are written into the accumulator [206] 806.
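
By way of a hedged illustration of the wrap-around behaviour described for ADDL.fmt, the sketch below adds corresponding elements of two vectors and wraps each sum into a signed M-bit accumulator element. The function names `wrap_signed` and `addl`, the 24-bit accumulator element width, and the use of Python lists for vectors are all assumptions made for this example and are not drawn from the specification.

```python
from typing import List


def wrap_signed(value: int, m_bits: int) -> int:
    """Wrap `value` around the signed two's-complement range of m_bits."""
    value &= (1 << m_bits) - 1           # keep only the low m_bits
    if value & (1 << (m_bits - 1)):      # sign bit set: negative value
        value -= 1 << m_bits
    return value


def addl(vs: List[int], vt: List[int], m_bits: int = 24) -> List[int]:
    """Element-wise add of vs and vt, wrapped into m_bits accumulator elements.

    Hypothetical model of ADDL.fmt; m_bits = 24 is an assumed width.
    """
    return [wrap_signed(a + b, m_bits) for a, b in zip(vs, vt)]
```

Unlike clamping, wrapping discards the bits above the accumulator element width, so a sum one past the largest value reappears at the smallest value.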

Please amend the paragraph starting on page 22, line 1, as follows: 

A RACL/RACM/RACH instruction followed by WACL/WACH are used to save and 
restore the accumulator. This save/restore function is format independent, either
format can be used to save or restore accumulator values generated by either QH or OB 
operations. Data conversion need not occur. The mapping between element bits of the OB 
format accumulator and bits of the same accumulator interpreted in QH format is 
implementation specific, but consistent for each implementation. 
Amendments to the Drawings

Submitted herewith is a replacement drawing sheet for Figure 2, corresponding to the 
annotated drawing sheet also submitted herewith. Specifically, element reference numeral 
200 was added to Figure 2 to be consistent with the specification as originally filed. 
Amendments to the Claims



Applicants submit no amendments to the claims. 
Remarks

Claims 47, 53, 67, 72, and 79-82 are pending in the application, with claims 47, 67, 
and 79-82 being the independent claims. 

This Amendment corrects formal matters without changing the scope of the claims. 
Specifically, drawing reference numerals were added or corrected in the specification that 
correspond to the drawings as originally filed, and amendments were made to correct minor 
informalities throughout the specification. In addition, Figure 2 was amended to include 
reference numeral 200 to be consistent with the specification as originally filed. Both an 
annotated drawing sheet and a replacement drawing sheet for Figure 2 are submitted 
herewith. None of the amendments add new matter. Accordingly, Applicants respectfully 
request that this Amendment be entered. Prompt and favorable consideration of this 
Amendment is respectfully requested. 



Respectfully submitted, 



Sterne, Kessler, Goldstein & Fox p.l.l.c.

Donald J. Featherstone
Attorney for Applicants
Registration No. 33,876

1100 New York Avenue, N.W.
Washington, D.C. 20005-3934 
(202) 371-2600 
Exhibit I



Pending Claims 
U.S. Patent Application No. 09/223,046 

47. A computer-based method for providing extended precision in single instruction multiple 
data (SIMD) arithmetic operations, comprising the steps of: 

(a) loading a first vector into a first register, said first vector comprising a plurality of 
N-bit elements; 

(b) loading a second vector into a second register, said second vector comprising a 
plurality of N-bit elements; 

(c) executing an arithmetic instruction for at least one pair consisting of an N-bit 
element in said first register and an N-bit element in said second register, to produce a resulting 
element; 

(d) writing said resulting element into an M-bit element of an accumulator, wherein 
M is greater than N; 

(e) transforming said resulting element in said accumulator into a width of N-bits;
and

(f) writing said resulting element into a third register; 

wherein said accumulator comprises a plurality of M-bit elements and wherein steps 
(c)-(f) operate on a plurality of elements of said first and second vectors to produce a resultant 
vector formed from a plurality of resulting elements written to said third register; and 

wherein said resulting elements in said accumulator are wrapped around the representable 
range of said resulting elements. 

53. The method as recited in claim 80, wherein said rounding step comprises one of:

rounding said resulting element towards zero;

rounding said resulting element towards the nearest unit, wherein said resulting element 
is rounded away from zero if said resulting element is at least halfway towards the nearest unit; 
and 

rounding said resulting element towards the nearest unit, wherein said resulting element 
is rounded towards zero if said resulting element is at least halfway towards the nearest unit. 

67. A processor for providing extended precision in single instruction multiple data (SIMD) 
arithmetic operations, comprising: 

means for executing an arithmetic instruction involving an element of a first vector and 
an element of a second vector to produce a resulting element, said first and second vector 
comprising a plurality of N-bit elements; 

an accumulator for receiving said resulting element, wherein said resulting element is 
stored in an M-bit element of said accumulator and wherein M is greater than N; 

means for transforming said resulting element in said accumulator into a width of N-bits;
and

means for writing said transformed resulting element to a register; 

wherein said accumulator comprises a plurality of M-bit elements and wherein said 
means for executing is repeated for said plurality of elements of said first and second vectors to 
produce a plurality of resulting elements that are received by said accumulator and wherein said 
means for transforming and said means for writing are performed on said plurality of resulting 
elements; and 

wherein said resulting elements in said accumulator are wrapped around the representable 
range of said resulting elements. 
72. The processor as recited in claim 82, wherein said rounding means comprises one of:
means for rounding said resulting element towards zero; 

means for rounding said resulting element towards the nearest unit, wherein said resulting 
element is rounded away from zero if said resulting element is at least halfway towards the 
nearest unit; and 

means for rounding said resulting element towards the nearest unit, wherein said resulting 
element is rounded towards zero if said resulting element is at least halfway towards the nearest 
unit. 

79. A computer-based method for providing extended precision in single instruction multiple 
data (SIMD) arithmetic operations, comprising the steps of: 

(a) loading a first vector into a first register, said first vector comprising a plurality of 
N-bit elements; 

(b) loading a second vector into a second register, said second vector comprising a 
plurality of N-bit elements; 

(c) executing an arithmetic instruction for at least one pair consisting of an N-bit 
element in said first register and an N-bit element in said second register, to produce a resulting 
element; 

(d) writing said resulting element into an M-bit element of an accumulator, wherein 
M is greater than N; 

(e) transforming said resulting element in said accumulator into a width of N-bits; 

(f) dividing said resulting elements stored in said accumulator into a plurality of 
subsets; 
(g) writing each subset to at least one of a plurality of registers, each of said plurality
of registers having a width smaller than said accumulator width; and 

(h) writing said resulting element into a third register; 

wherein said accumulator comprises a plurality of M-bit elements and wherein steps 
(c)-(h) operate on a plurality of elements of said first and second vectors to produce a resultant 
vector formed from a plurality of resulting elements written to said third register. 

80. A computer-based method for providing extended precision in single instruction multiple 
data (SIMD) arithmetic operations, comprising the steps of: 

(a) loading a first vector into a first register, said first vector comprising a plurality of 
N-bit elements; 

(b) loading a second vector into a second register, said second vector comprising a 
plurality of N-bit elements; 

(c) executing an arithmetic instruction for at least one pair consisting of an N-bit 
element in said first register and an N-bit element in said second register, to produce a resulting 
element; 

(d) writing said resulting element into an M-bit element of an accumulator, wherein 
M is greater than N; 

(e) transforming said resulting element in said accumulator into a width of N-bits, 
wherein said transforming comprises shifting said resulting element in said accumulator for 
scaling the value of said resulting element, rounding said resulting element and clamping said 
resulting element; and 

(f) writing said resulting element into a third register. 
81. A processor for providing extended precision in single instruction multiple data (SIMD)
arithmetic operations, comprising: 

means for executing an arithmetic instruction involving a first plurality of elements of a 
first vector and a second plurality of elements of a second vector to produce a plurality of 
resulting elements, said first and second vector comprising a plurality of N-bit elements; 

an accumulator for receiving said plurality of resulting elements, wherein said plurality of 
resulting elements are each stored in one of a plurality of M-bit elements of said accumulator and 
wherein M is greater than N; 

means for transforming said plurality of resulting elements in said accumulator into a 
width of N-bits; 

means for dividing said plurality of resulting elements stored in said accumulator into a 
plurality of subsets; and 

means for writing each subset to at least one of a plurality of registers, each of said 
plurality of registers having a width smaller than said accumulator width. 

82. A processor for providing extended precision in single instruction multiple data (SIMD) 
arithmetic operations, comprising: 

means for executing an arithmetic instruction involving an element of a first vector and 
an element of a second vector to produce a resulting element, said first and second vector 
comprising a plurality of N-bit elements; 

an accumulator for receiving said resulting element, wherein said resulting element is 
stored in an M-bit element of said accumulator and wherein M is greater than N; 
means for transforming said resulting element in said accumulator into a width of N-bits,
wherein said means for transforming comprises means for shifting said resulting element in said 
accumulator for scaling the value of said resulting element, means for rounding said resulting 
element, and means for clamping said resulting element; and 

means for writing said transformed resulting element to a register. 
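
For readers unfamiliar with the transforming step recited in claims 80 and 82 (shifting to scale, rounding, and clamping), the following Python sketch shows one way such a read-out of a single accumulator element might look. It is a sketch under stated assumptions and not the claimed implementation: the function name `transform_element`, the 16-bit result width, and the choice of rounding to the nearest unit with halfway cases rounded away from zero (one of the alternatives listed in claims 53 and 72) are all illustrative.

```python
def transform_element(acc: int, shift: int, n_bits: int = 16) -> int:
    """Reduce a wide accumulator element to a signed n_bits value.

    Hypothetical helper. Steps: scale by a right shift, round to the
    nearest unit (halfway cases away from zero), then clamp.
    """
    if shift > 0:
        half = 1 << (shift - 1)
        # Round on the magnitude so halfway cases move away from zero.
        magnitude = (abs(acc) + half) >> shift
        scaled = magnitude if acc >= 0 else -magnitude
    else:
        scaled = acc
    largest = (1 << (n_bits - 1)) - 1
    smallest = -(1 << (n_bits - 1))
    # Clamp the scaled, rounded value to the representable range.
    return max(smallest, min(largest, scaled))
```

Keeping the wider accumulator until this final step means that precision is lost only once, at read-out, rather than after every arithmetic instruction.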
Exhibit J



Supplemental Declaration for Patent Application 



Docket Number: 1778.0110001 

As a below named inventor, I hereby declare that: 

My residence, mailing address and citizenship are as stated below next to my name. 

I believe I am an original, first and joint inventor of the subject matter that is claimed and for which a patent is 
sought on the invention entitled Method for Providing Extended Precision in SIMD Vector Arithmetic 
Operations, the specification of which is attached hereto unless the following box is checked: 

☒ was filed on December 30, 1998;

as United States Application Number 09/223,046; and 

was amended on August 11, 2000; December 5, 2000; June 14, 2002; and October 13, 2004.

I hereby state that I have reviewed and understand the contents of the above identified specification, including 
the claims, as amended by any amendment referred to above. 

I acknowledge the duty to disclose information that is material to patentability as defined in 37 C.F.R. § 1.56, 
including for continuation-in-part applications, material information which became available between the filing 
date of the prior application and the national or PCT international filing date of the continuation-in-part 
application. 

I hereby claim foreign priority benefits under 35 U.S.C. § 119(a)-(d) or (f) or § 365(b) of any foreign
application(s) for patent, inventor's or plant breeder's rights certificate(s), or § 365(a) of any PCT international 
application, which designated at least one country other than the United States of America, listed below, and 
have also identified below, by checking the box, any foreign application for patent, inventor's or plant breeder's 
rights certificate(s), or PCT international application having a filing date before that of the application on which
priority is claimed. 

Prior Foreign Application(s):

(Application No.)   (Country)   (Day/Month/Year Filed)   Priority Claimed: □ Yes □ No
(Application No.)   (Country)   (Day/Month/Year Filed)   Priority Claimed: □ Yes □ No

Send Correspondence to: Customer No. 26111

Sterne, Kessler, Goldstein & Fox P.L.L.C.
1100 New York Avenue, N.W.
Washington, D.C. 20005-3934 

Direct Telephone Calls to: (202) 371-2600 

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on 
information and belief are believed to be true; and further that these statements were made with the knowledge 
that willful false statements and the like so made are punishable by fine or imprisonment, or both, under 18 
U.S.C. § 1001 and that such willful false statements may jeopardize the validity of the application or any patent 
issued thereon. 
Appl. No. 09/223,046
Docket No. 1778.0110001

Full Name of First Inventor: Timothy J. Van Hook

Signature of First Inventor:

Date:

Residence: Atherton, California

Citizenship: U.S.A.

Mailing Address: 224 Oakgrove Avenue, Atherton, CA 94027
Don Featherstone - FedEx Shipment 792905387783 Delivered



From: Notifications@fedex.com 

To: <donf@skgf.com> 

Date: 4/28/2005 3:30 PM 

Subject: FedEx Shipment 792905387783 Delivered 



Our records indicate that the following shipment has been delivered: 



Tracking number: 792905387783
Door Tag number: DT100682224252
Reference: 1778.0110001
Ship (P/U) date: Apr 25, 2005
Delivery date: Apr 28, 2005 12:27 PM
Signed for by: AJALPA
Service type: FedEx Standard Overnight
Packaging type: FedEx Box
Number of pieces: 1
Weight: 1.0 LB



Shipper Information
Donald J. Featherstone
Sterne Kessler Goldstein & Fox
1100 New York Avenue, NW
Washington
DC
US
20005

Recipient Information
Timothy J. Van Hook
224 Oakgrove Avenue
Atherton
CA
US
94027



Special handling/Services 
Deliver Weekday 

Please do not respond to this message. This email was sent from an unattended 
mailbox. This report was generated at approximately 2:29 PM CDT on 04/28/2005. 

For questions about FedEx Express, please call us at 1.800.Go.FedEx. 



All weights are estimated. 



To track the status of this shipment online, please use the following: 
http://www.fedex.com/Tracking?tracknumbers=792905387783&action=track&language=english&cntry_code=us&clienttype=ivpodalrt

Thank you for your business.
Declaration and Power of Attorney
for a Patent Application

Declaration

As below named inventor, I hereby declare that my residence post office address, and citizenship are as stated 
below my name. Further, I hereby declare that I believe I am the original, first and sole inventor (if only one name is 
listed below) or an original, first and joint inventor (if plural names are listed below) of the subject matter which is 
claimed and for which a patent is sought on the invention entitled: 

A METHOD FOR PROVIDING EXTENDED PRECISION IN SIMD VECTOR ARITHMETIC OPERATIONS 

the specification of which:
is attached hereto, or

x was filed on 10/9/97 as application serial no. 08/947,648; and

was amended on

I hereby state that I have reviewed and understand the contents of the above identified specification, including
the claims, as amended by any amendment referred to above; and

I acknowledge the duty to disclose information which is material to the examination of this application in
accordance with Title 37, Code of Federal Regulations, Section 1.56(a).

Foreign Priority Claim

I hereby claim foreign priority benefits under Title 35, United States Code Section 119 of any foreign application(s)
for patent or inventor's certificate listed below and have also identified below any foreign application for patent or
inventor's certificate having a filing date before that of the application on which priority is claimed:

Number   Country   Date Filed   Priority Claimed

yes no

yes no

U.S. Priority Claim

I hereby claim the benefit under Title 35, United States Code, Section 120 of any United States application(s)
listed below and, insofar as the subject matter of each of the claims of this application is not disclosed in the prior
United States application in the manner provided by the first paragraph of Title 35, United States Code, Section
112, I acknowledge the duty to disclose material information as defined in Title 37, Code of Federal Regulations,
Section 1.56(a) which occurred between the filing date of the prior application and the national or PCT
international filing date of this application:

Serial Number   Filing Date   Status (patented/pending/abandoned)
Attorney Docket No.: SGI 15-4-458.00

Power of Attorney 

As a named inventor, I hereby appoint the following attorney(s) and/or agent(s) to prosecute this application and 
transact all business in the Patent Trademark Office connected therewith.

[illegible] Registration No.: 36,398
[illegible] Registration No.: 35,295
[illegible] Wagner Registration No.: 35,398
[illegible] Registration No.: P-42,293
[illegible] Registration No.: P-41,923
[illegible] Registration No.: 38,330
[illegible] Registration No.: 32,204
[illegible] Registration No.: 34,625
[illegible] Registration No.: 40,530

Send Correspondence to: 



WAGNER, MURABITO & HAO 

Two North Market Street, Third Floor 
San Jose, California 95113 
(408) 938-9060 



Signatures 

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on 
information and belief are believed to be true; and further that these statements were made with the knowledge 
that willful false statements and the like so made are punishable by fine or imprisonment, or both, under Section 
1001 of Title 18 of the United States Code and that such willful false statements may jeopardize the validity of the 
application or any patent issued thereon. 

Full Name of Sole/First Inventor: Timothy van Hook

Inventor's Signature                    Date
Residence Atherton, California          Citizenship USA
(City, State)
P.O. Address 224 Oakgrove Avenue, Atherton, California 94027

Full Name of Second/Joint Inventor: Pete

Inventor's Signature [signature]        Date [illegible]
Residence Fremont, California           Citizenship U.S.A.
(City, State)
P.O. Address [illegible], California 94555

Full Name of Third/Joint Inventor: William A. Huffman

Inventor's Signature                    Date
Residence Los [illegible]               Citizenship USA
(City, State)
P.O. Address [illegible] Lane, Los Gatos, California 95032

Full Name of Fourth/Joint Inventor: Henry P. Moreton

Inventor's Signature                    Date
Residence Woodside, California          Citizenship USA
(City, State)
P.O. Address [illegible] Phillip Road, Woodside, California 94062-2625

Full Name of Fifth/Joint Inventor: Earl A. Killian

Inventor's Signature                    Date
Residence Los Altos [illegible]         Citizenship USA
(City, State)
P.O. Address [illegible] Hills, California 94022
Attorney Docket No.: SGI 15-4-458.00

Declaration and Power of Attorney
for a Patent Application 



As below named inventor, I hereby declare that my residence post office address, and citizenship are as stated 
below my name. Further, I hereby declare that I believe I am the original, first and sole inventor (if only one name is 
listed below) or an original, first and joint inventor (if plural names are listed below) of the subject matter which is 
claimed and for which a patent is sought on the invention entitled: 

A METHOD FOR PROVIDING EXTENDED PRECISION IN SIMD VECTOR ARITHMETIC OPERATIONS 

the specification of which: 
is attached hereto, or

x was filed on 10/9/97 as application serial no. 08/947,648; and

was amended on 



I hereby state that I have reviewed and understand the contents of the above identified specification, including 
the claims, as amended by any amendment referred to above; and 

I acknowledge the duty to disclose information which is material to the examination of this application in 
accordance with Title 37, Code of Federal Regulations, Section 1.56(a).



Foreign Priority Claim 

I hereby claim foreign priority benefits under Title 35, United States Code Section 119 of any foreign application(s)
for patent or inventor's certificate listed below and have also identified below any foreign application for patent or 
inventor's certificate having a filing date before that of the application on which priority is claimed: 

Number   Country   Date Filed   Priority Claimed

yes no

yes no



U.S. Priority Claim 

I hereby claim the benefit under Title 35, United States Code, Section 120 of any United States application(s) 
listed below and, insofar as the subject matter of each of the claims of this application is not disclosed in the prior 
United States application in the manner provided by the first paragraph of Title 35, United States Code, Section 
112, I acknowledge the duty to disclose material information as defined in Title 37, Code of Federal Regulations,
Section 1.56(a) which occurred between the filing date of the prior application and the national or PCT 
international filing date of this application: 

Serial Number Filing Date Status (patented/pending/abandoned)
Attorney Docket No.: SGI 15-4-458.00

Power of Attorney 

As a named inventor, I hereby appoint the following attorney(s) and/or agent(s) to prosecute this application and 
transact all business in the Patent Trademark Office connected therewith. 

James P. Hao Registration No.: 36,398

[illegible] Registration No.: 35,295

John P. Wagner Registration No.: 35,398

Glenn D. Barnes Registration No.: P-42,293

Wilfred H. Lam Registration No.: P-41,923

Steve Weiner Registration No.: 38,330

Chris Byrne Registration No.: 32,204

Irene Fernandez Registration No.: 34,625

John Brigden Registration No.: 40,530

Send Correspondence to: 

WAGNER, MURABITO & HAO 

Two North Market Street, Third Floor 
San Jose, California 95113 
(408) 938-9060 



Signatures 

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on 
information and belief are believed to be true; and further that these statements were made with the knowledge 
that willful false statements and the like so made are punishable by fine or imprisonment, or both, under Section 
1001 of Title 18 of the United States Code and that such willful false statements may jeopardize the validity of the 
application or any patent issued thereon. 

Full Name of Sole/First Inventor: Timothy van Hook 



Inventor's Signature Date

Residence Atherton, California Citizenship USA

(City State)

P.O. Address [illegible] 94027

Full Name of Second/Joint Inventor: Peter Hsu 



Inventor's Signature Date

Residence Fremont, California Citizenship

(City State)

P.O. Address [illegible] 94555

Full Name of Third/Joint Inventor: William A. Huffman

Inventor's Signature [signature] Date [illegible]

Residence Los Gatos, California Citizenship USA

(City State)

P.O. Address 16205 Roseleaf Lane, Los Gatos, California 95032
Attorney Docket No.: SGI 15-4-458.00



Full Name of Fourth/Joint Inventor: Henry P. Moreton

Inventor's Signature Date
Residence Woodside, California Citizenship USA

(City State)
P.O. Address [illegible] 94062-2625

Full Name of Fifth/Joint Inventor: Earl A. Killian



Inventor's Signature Date

Residence Los Altos Hills, California Citizenship USA

(City State)

P.O. Address 27961 Central Drive, Los Altos Hills, California 94022 
Attorney Docket No.: SGI 15-4-458.00

Declaration and Power of Attorney 
for a Patent Application 



Declaration



As below named inventor, I hereby declare that my residence post office address, and citizenship are as stated 
below my name. Further, I hereby declare that I believe I am the original, first and sole inventor (if only one name is 
listed below) or an original, first and joint inventor (if plural names are listed below) of the subject matter which is 
claimed and for which a patent is sought on the invention entitled: 

A METHOD FOR PROVIDING EXTENDED PRECISION IN SIMD VECTOR ARITHMETIC OPERATIONS

the specification of which: 



is attached hereto, or

x was filed on 10/9/97 as application serial no. 08/947,648 and

was amended on



I hereby state that I have reviewed and understand the contents of the above identified specification, including 
the claims, as amended by any amendment referred to above; and 

I acknowledge the duty to disclose information which is material to the examination of this application in 
accordance with Title 37, Code of Federal Regulations, Section 1.56(a). 



Foreign Priority Claim 

I hereby claim foreign priority benefits under Title 35, United States Code Section 119 of any foreign application(s)
for patent or inventor's certificate listed below and have also identified below any foreign application for patent or 
inventor's certificate having a filing date before that of the application on which priority is claimed: 

Number Country Date Filed Priority Claimed 



yes no 



yes no 



U.S. Priority Claim 

I hereby claim the benefit under Title 35, United States Code, Section 120 of any United States application(s) 
listed below and, insofar as the subject matter of each of the claims of this application is not disclosed in the prior 
United States application in the manner provided by the first paragraph of Title 35, United States Code, Section 
112, I acknowledge the duty to disclose material information as defined in Title 37, Code of Federal Regulations,
Section 1.56(a) which occurred between the filing date of the prior application and the national or PCT 
international filing date of this application: 

Serial Number Filing Date Status (patented/pending/abandoned) 
Attorney Docket No.: SGI 15-4-458.00

Power of Attorney 

As a named inventor, I hereby appoint the following attorney(s) and/or agent(s) to prosecute this application and 
transact all business in the Patent Trademark Office connected therewith. 



James P. Hao Registration No.: 36,398

[illegible] Registration No.: 35,295

John P. Wagner Registration No.: 35,398

Glenn D. Barnes Registration No.: P-42,293

Wilfred H. Lam Registration No.: P-41,923

Steve Weiner Registration No.: 38,330

Chris Byrne Registration No.: 32,204

Irene Fernandez Registration No.: 34,625

John Brigden Registration No.: 40,530

Send Correspondence to:

WAGNER, MURABITO & HAO

Two North Market Street, Third Floor 
San Jose, California 95113
(408) 938-9060 



Signatures 

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on 
information and belief are believed to be true; and further that these statements were made with the knowledge 
that willful false statements and the like so made are punishable by fine or imprisonment, or both, under Section 
1001 of Title 18 of the United States Code and that such willful false statements may jeopardize the validity of the
application or any patent issued thereon. 

Full Name of Sole/First Inventor: Timothy van Hook 



Inventor's Signature Date 

Residence Atherton, California Citizenship USA

(City State)

P.O. Address 224 Oakgrove Avenue, Atherton, California 94027



Full Name of Second/Joint Inventor: Peter Hsu 



Inventor's Signature Date 
Residence Fremont, California Citizenship

(City State)
P.O. Address [illegible] 94555

Full Name of Third/Joint Inventor: William A. Huffman



Inventor's Signature Date 

Residence Los Gatos, California Citizenship USA

(City State)

P.O. Address 16205 Roseleaf Lane, Los Gatos, California 95032
Attorney Docket No.: SGI 15-4-458.00



Full Name of Fourth/Joint Inventor: Henry P. Moreton

Inventor's Signature [signature] Date

Residence Woodside, California Citizenship USA

(City State)
P.O. Address [illegible]

Full Name of Fifth/Joint Inventor: Earl A. Killian



Inventor's Signature Date

Residence Los Altos Hills, California Citizenship USA

(City State)
P.O. Address 27961 Central Drive, Los Altos Hills, California 94022
Declaration and Power of Attorney
for a Patent Application 



Attorney Docket No.: SGI 15-4-458.00



Declaration 



As below named inventor, I hereby declare that my residence post office address, and citizenship are as stated 
below my name. Further, I hereby declare that I believe I am the original, first and sole inventor (if only one name is 
listed below) or an original, first and joint inventor (if plural names are listed below) of the subject matter which is 
claimed and for which a patent is sought on the invention entitled: 

A METHOD FOR PROVIDING EXTENDED PRECISION IN SIMD VECTOR ARITHMETIC OPERATIONS 

the specification of which: 



is attached hereto, or
was filed on 10/9/97 as application serial no. 08/947,648 and
was amended on



I hereby state that I have reviewed and understand the contents of the above identified specification, including 
the claims, as amended by any amendment referred to above; and 



I acknowledge the duty to disclose information which is material to the examination of this application in 
accordance with Title 37, Code of Federal Regulations, Section 1.56(a).



Foreign Priority Claim 

I hereby claim foreign priority benefits under Title 35, United States Code Section 119 of any foreign application(s) 
for patent or inventor's certificate listed below and have also identified below any foreign application for patent or 
inventor's certificate having a filing date before that of the application on which priority is claimed: 

Number Country Date Filed Priority Claimed 

yes no

yes no



U.S. Priority Claim 

I hereby claim the benefit under Title 35, United States Code, Section 120 of any United States application(s)
listed below and, insofar as the subject matter of each of the claims of this application is not disclosed in the prior 
United States application in the manner provided by the first paragraph of Title 35, United States Code, Section 
112, I acknowledge the duty to disclose material information as defined in Title 37, Code of Federal Regulations,
Section 1.56(a) which occurred between the filing date of the prior application and the national or PCT 
international filing date of this application: 

Serial Number Filing Date Status (patented/pending/abandoned) 
Power of Attorney



As a named inventor, I hereby appoint the following attorney(s) and/or agent(s) to prosecute this application and 
transact all business in the Patent Trademark Office connected therewith. 

James P. Hao Registration No.: 36,398

[illegible] Registration No.: 35,295

John P. Wagner Registration No.: 35,398

Glenn D. Barnes Registration No.: P-42,293

Wilfred H. Lam Registration No.: P-41,923

Steve Weiner Registration No.: 38,330

Chris Byrne Registration No.: 32,204

Irene Fernandez Registration No.: 34,625

John Brigden Registration No.: 40,530

Send Correspondence to: 

WAGNER, MURABITO & HAO 

Two North Market Street, Third Floor 
San Jose, California 95113 
(408) 938-9060 



Signatures 

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on
information and belief are believed to be true; and further that these statements were made with the knowledge 
that willful false statements and the like so made are punishable by fine or imprisonment, or both, under Section 
1001 of Title 18 of the United States Code and that such willful false statements may jeopardize the validity of the 
application or any patent issued thereon. 

Full Name of Sole/First Inventor: Timothy van Hook 



Inventor's Signature Date 

Residence Atherton, California Citizenship USA

(City State)
P.O. Address 224 Oakgrove Avenue, Atherton, California 94027

Full Name of Second/Joint Inventor: Peter Hsu 



Inventor's Signature Date

Residence Fremont, California Citizenship

(City State)
P.O. Address 2853 Welk Common, [illegible] 94555

Full Name of Third/Joint Inventor: William A. Huffman 



Inventor's Signature Date 

Residence Los Gatos, California Citizenship USA

(City State)

P.O. Address 16205 Roseleaf Lane, Los Gatos, California 95032 
Full Name of Fourth/Joint Inventor: Henry P. Moreton



Inventor's Signature Date 

Residence Woodside, California Citizenship USA

(City State)
P.O. Address [illegible]

Full Name of Fifth/Joint Inventor: Earl A. Killian



Inventor's Signature [signature] Date [illegible]

Residence Los Altos Hills, California Citizenship USA

(City State)

P.O. Address 27961 Central Drive, Los Altos Hills, California 94022 






